Deriving rigorous bounds for the time scales that are needed for thermalization forms one of the 
most vexing problems when it comes to understanding statistical mechanics from the principles of 
quantum mechanics. One central aspect in obtaining such bounds is to determine how long a system 
retains memory of its initial conditions. By viewing this problem from a quantum information 
theory perspective, we are able to simplify part of this task in a very natural and easy way. We 
first show that for any interaction between the system and the environment, and almost all initial 
states of the system, the question of how long such memory lasts can be answered by studying 
the temporal evolution of just one special initial state. This special state thereby depends only on 
our knowledge of macroscopic parameters of the system. We provide a simple entropic inequality 
for this state that can be used to determine whether most states of the system have, or have not 
become independent of their initial conditions after time t. Analyzing the rate of entropy change 
over time for a particular kind of interaction then allows us to place rigorous bounds on such time 
scales. We make a similar statement for almost all initial states of the environment, and finally 
provide a sufficient condition for which a system never thermalizes, but remains close to its initial 
state for all times. 



We are all familiar with thermalization on a macro- 
scopic level - simply consider what happens when you 
leave your cup of coffee untouched for a while. Yet, un- 
derstanding this process from a microscopic level forms 
a challenging endeavour. How could we hope to justify 
thermalization from the rules of quantum mechanics? 

To tackle this problem it is helpful to break it up into 
smaller, more manageable, components. As [1] point out, 
the straightforward-looking process of thermalization actually
consists of four aspects which may be addressed 
independently. Roughly speaking, they deal with several 
different questions that we might ask about a system S 
after it is placed into contact with an environment (bath) 
E. The first of these is whether the system equilibrates. 
That is, does it eventually evolve towards some particu- 
lar equilibrium state and remain close to it? Note that 
when we only ask about equilibration, we do not care 
what form this equilibrium state actually takes. In particular,
it may depend on the initial state of the system 
and/or the environment and does not need to be a thermal 
state. A second question is thus whether this equilibrium 
state is indeed independent of the initial state of the system.
Note that one may also think of this question as 
asking whether the system retains at least some amount 
of memory of its precise initial conditions in equilibrium. 
Similarly, the third question asks whether the equilibrium 
state depends on the precise details of the initial state of 
the environment, or only on its macroscopic parameters 
such as temperature. Finally, if we find that the equilibrium
state of the system is indeed independent of such 
initial states, we may then ask whether it actually takes 
on the familiar Boltzmann form. 

However, there is of course one more pressing question 
when it comes to thermalization: just how long does it 
take for a system to thermalize? Indeed, the problem of 
deriving rigorous bounds for the time scales which are 
needed for thermalization has recently been called the 
most important open problem in the project of justifying 
statistical physics from first principles of quantum mechanics
[3]. Only rather weak bounds are known on 
such time scales in general [4], alongside some analyti- 
cal [51 H] and numerical [T^fS] results for specific systems. 

Note that we may pose the question of time-scales to 
each of the aspects above individually. For example, we 
could ask not only if a system loses memory of its initial 
state, but just how fast this memory loss occurs. This 
is the approach we will take here, where we indeed focus 
on the system's memory of the initial conditions which 
plays a crucial role in understanding thermalization [S]. 
To study the question of time-scales it is thereby not 
enough to study long time averages [1], but ideally we 
want to make statements about the actual state of the 
system at a particular fixed time t. Instead of asking 
questions about the equilibrium state, we thus ask 

• Independence of the initial state of the system. At 
time t, does the state of the system depend on the 
precise initial state of the system? (or only on its 
macroscopic parameters?) 

• Independence of the initial state of the environ- 
ment. At time t, does the state of the system de- 
pend on the precise initial state of the environment? 
(or only on its macroscopic parameters?) 

Since independence of such initial conditions is a neces- 
sary condition for the system to be in a thermal state, 
analyzing said time-scales places a lower bound on the 
time that it takes to thermalize. 



Before stating our results, let us first describe our setup 
more carefully. Consider a system S and an environment 
E described by Hilbert spaces $\mathcal{H}_S$ and $\mathcal{H}_E$ respectively. 
Macroscopic constraints imposed on the system or the environment
take the form of subspaces $\mathcal{H}_{\Omega_S} \subseteq \mathcal{H}_S$ and 
$\mathcal{H}_{\Omega_E} \subseteq \mathcal{H}_E$ respectively. Before placing them into contact,
the system and the environment are uncorrelated. 
That is, the initial state on $\mathcal{H}_S \otimes \mathcal{H}_E$ at time t = 0 takes 
the form $|\phi\rangle_S \otimes |\psi\rangle_E$, where to explain our result we will 
for simplicity assume that $|\phi\rangle_S \in \mathcal{H}_{\Omega_S}$ and $|\psi\rangle_E \in \mathcal{H}_{\Omega_E}$ 
are pure states [35]. Interaction between the system and 
the environment is governed by the Hamiltonian $H_{SE}$.

We will also need to quantify the amount of entropy 
in the system and its environment. The relevant quantities
for a single experiment (a.k.a. single shot) are the 
min- and max-entropies, well established in quantum information
theory. For a single system these can easily 
be expressed in terms of the eigenvalues $\{\lambda_j\}_j$ of the 
state $\rho_S = \sum_j \lambda_j |j\rangle\langle j|$ as $H_{\min}(S)_\rho = -\log \max_j \lambda_j$ and 
$H_{\max}(S)_\rho = 2 \log \sum_j \sqrt{\lambda_j}$. Both quantities enjoy nice 
operational interpretations in quantum information [10] 
as well as thermodynamics [11, 12]. We will also refer 
to smoothed versions of these quantities, $H^{\varepsilon}_{\min}$ and $H^{\varepsilon}_{\max}$, 
which can be thought of as equal to the original quantity,
except up to an error $\varepsilon$. We refer to the appendix 
for a detailed introduction. Both quantities converge to 
the von Neumann entropy $H(S)_\rho = -\sum_j \lambda_j \log \lambda_j$ in the 
asymptotic limit [13] of many experiments. 
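
As a small illustration of the three entropies just defined, the following sketch evaluates them directly from a list of eigenvalues. It is not part of the original derivation: the function names and the example spectrum are hypothetical, and the logarithm is taken to base 2, as stated in the notation section of the appendix.

```python
import math

def min_entropy(eigenvalues):
    """H_min = -log2 of the largest eigenvalue."""
    return -math.log2(max(eigenvalues))

def max_entropy(eigenvalues):
    """H_max = 2 * log2 of the sum of the square roots of the eigenvalues."""
    return 2 * math.log2(sum(math.sqrt(x) for x in eigenvalues))

def von_neumann_entropy(eigenvalues):
    """H = -sum_j lambda_j log2 lambda_j, skipping zero eigenvalues."""
    return -sum(x * math.log2(x) for x in eigenvalues if x > 0)

# Hypothetical example spectrum (an assumption, not taken from the paper).
spectrum = [0.5, 0.25, 0.25]
print(min_entropy(spectrum), von_neumann_entropy(spectrum), max_entropy(spectrum))
```

For any normalized spectrum the three values are ordered as min-entropy, von Neumann entropy, max-entropy, and they coincide for a flat spectrum.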

Finally, when we say that two quantum states $\rho$ and $\sigma$ 
are close, we mean that their trace distance $\|\rho - \sigma\|_1$ with 
$\|A\|_1 = \mathrm{tr}\sqrt{A^\dagger A}$ is very small. As the trace distance 
quantifies how well we can distinguish $\rho$ from $\sigma$ when 
given with equal probability [14], this says that there 
exists no physical process that can easily tell them apart. 



I. RESULTS 

We are now ready to state our results. We emphasize 
that whereas our explanations here are rather informal 
for the purpose of illustration, our results are fully 
rigorous. Precise statements as well as technical 
details can be found in the appendix. 

Independence of the initial state of the system We 
first consider the role of the initial state of the system. 
Let us thus fix the state of the environment $|\psi\rangle_E$. Before 
embarking on the study of time-scales, we first show our 
main result, namely that we can considerably simplify 
the problem for any interaction $H_{SE}$. In particular, we 
show that whether the system has become independent 
of its initial state $|\phi\rangle_S \in \mathcal{H}_{\Omega_S}$ after time t can, for almost 
all initial states, be decided by analyzing the temporal 
evolution $U(t) = \exp(-i H_{SE} t)$ of just one special state 
given by 

$$\tau_{SE}(t) = U(t) \left( \pi_{\Omega_S} \otimes |\psi\rangle\langle\psi|_E \right) U(t)^\dagger , \tag{1}$$

where $\pi_{\Omega_S}$ denotes the maximally mixed state on $\mathcal{H}_{\Omega_S}$. 
More precisely, we prove that if for a particular time t, 
we have that 

$$H^{\varepsilon}_{\min}(E)_\tau \gtrsim H^{\varepsilon}_{\max}(S)_\tau , \tag{2}$$

then the system will be independent of its initial state, 
for all but exponentially few initial states $|\phi\rangle_S$. All but 
exponentially few thereby means that the volume of states on 
$\mathcal{H}_{\Omega_S}$ that does not have this property is negligible, as we 
will outline in more detail in the methods section below. 
In this case, the state of the system will be close to $\tau_S(t)$ 
in trace distance, as illustrated in Figure 1. Note that 
this state only depends on the macroscopic constraints 
of the system, and not on any of the individual initial 
states that we might place the system in. However, if 

$$H^{\varepsilon}_{\max}(E)_\tau \lesssim H^{\varepsilon}_{\min}(S)_\tau , \tag{3}$$

then for a substantial fraction of initial states $|\phi\rangle_S$ the 
system does (still) depend on it. Indeed, this is the case 
at time t = 0, and it does of course depend on the details 
of $H_{SE}$ whether (2) can ever be satisfied at a later point 
in time. Our condition is tight up to differences in min- 
and max-entropies, which vanish in the asymptotic limit 
where both quantities converge to the von Neumann entropy.
This limit is relevant e.g. when considering quantum
memories. Our result easily extends to the case 
where the initial state of the system is correlated with 
a reference system (i.e. it is not pure). We emphasize 
that our result is entirely different from [15] which makes 
statements about "almost all" states on S and E. 

FIG. 1 (horizontal axis: time): If for the state $\tau_{SE}(t)$ the entropy of the environment 
exceeds the entropy of the system at time t, then the system 
has lost memory of its initial state and is close to $\tau_S(t)$, for 
almost all possible initial states $|\phi\rangle_S$. Note that we make no 
statement about equilibration, merely that almost all states 
"follow" $\tau_S(t)$, which may or may not equilibrate. 

Note that for (2) to hold, we need the environment 
to be of sufficient size to "absorb" the entropy of the 
system. It is easy to make this idea precise and show 
that for any $|\psi\rangle_E$ and any interaction Hamiltonian $H_{SE}$, 
if the system is sufficiently large (i.e., $\log d_{\Omega_S} > 2 \log d_E$), 
then the state of the system always depends on its initial 
state, for almost all initial states $|\phi\rangle_S$. 

We note that the criterion presented above is tight up 
to differences between smooth min- and max-entropies 
and becomes literally tight in an i.i.d. scenario where 
each of a large number n of systems S is undergoing the 
same interaction with its local environment. 
In this limit, a tight criterion in terms of the von 
Neumann entropy can be given, as discussed in detail 
in the appendix. Physically, we interpret this limit as a 
quantum memory $\mathcal{H}_S^{\otimes n}$ suffering the influence of noise. 
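
To make the two entropic criteria concrete, here is a minimal sketch for a toy model that is our own assumption and does not appear in the paper: one system qubit and one environment qubit, with the environment starting in $|0\rangle_E$ and $H_{SE}$ equal to the SWAP operator, so that $U(t) = \cos(t)\,\mathbb{I} - i \sin(t)\,\mathrm{SWAP}$. For this choice the marginals of $\tau_{SE}(t)$ are diagonal, with spectra $\{(1+\sin^2 t)/2,\ \cos^2 t/2\}$ on S and $\{(1+\cos^2 t)/2,\ \sin^2 t/2\}$ on E. The sketch uses the unsmoothed entropies ($\varepsilon = 0$) and treats the approximate inequalities as exact ones, so it only illustrates how the two entropies cross over time. The "almost all states" statements themselves are only meaningful for large dimensions.

```python
import math

def h_min(spectrum):
    """Min-entropy: -log2 of the largest eigenvalue."""
    return -math.log2(max(spectrum))

def h_max(spectrum):
    """Max-entropy: 2*log2 of the sum of square roots of the eigenvalues."""
    return 2 * math.log2(sum(math.sqrt(x) for x in spectrum))

def marginal_spectra(t):
    """Spectra of tau_S(t) and tau_E(t) for the assumed SWAP toy model."""
    s2, c2 = math.sin(t) ** 2, math.cos(t) ** 2
    tau_s = [(1 + s2) / 2, c2 / 2]
    tau_e = [(1 + c2) / 2, s2 / 2]
    return tau_s, tau_e

def still_remembers(t):
    """Exact-inequality version of criterion (3): H_max(E) < H_min(S)."""
    tau_s, tau_e = marginal_spectra(t)
    return h_max(tau_e) < h_min(tau_s)

def has_forgotten(t):
    """Exact-inequality version of criterion (2): H_min(E) >= H_max(S)."""
    tau_s, tau_e = marginal_spectra(t)
    return h_min(tau_e) >= h_max(tau_s)

# Scan a quarter period of the swap: memory is present early and gone at t = pi/2.
for k in range(5):
    t = k * math.pi / 8
    print(round(t, 3), still_remembers(t), has_forgotten(t))
```

At $t = \pi/2$ the swap is complete and the system qubit ends up in $|0\rangle$ whatever its initial state was, which matches the outcome of the second check.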

Independence of the initial state of the environment
We proceed to show a similar statement about the 
role of the initial state of the environment. Let us thus 
now fix the state of the system $|\phi\rangle_S$. We show that 
whether at time t the system depends on the 
initial state of the environment can, for almost all initial 
states $|\psi\rangle_E \in \mathcal{H}_{\Omega_E}$, be decided by considering the state 

$$\tau_{SE}(t) = U(t) \left( |\phi\rangle\langle\phi|_S \otimes \pi_{\Omega_E} \right) U(t)^\dagger , \tag{4}$$

where $\pi_{\Omega_E}$ denotes the maximally mixed state on $\mathcal{H}_{\Omega_E}$. 
In particular, we again obtain an entropic condition. If 
for a particular time t 

$$H^{\varepsilon}_{\min}(E)_\tau \gtrsim H^{\varepsilon}_{\max}(S)_\tau , \tag{5}$$

then the system is independent of the initial state of the 
environment, for all but exponentially few initial states 
$|\psi\rangle_E$. In particular, the state of the system at such times 
is very close to $\tau_S(t)$, which only depends on macroscopic 
constraints of the environment. If on the other hand 

$$H^{\varepsilon}_{\max}(E)_\tau \lesssim H^{\varepsilon}_{\min}(S)_\tau , \tag{6}$$

then a substantial fraction of initial states of the environment
lead to different states of the system at time 
t. Our condition is again tight up to differences in min- 
and max-entropies of the state $\tau_{SE}(t)$, which vanish in 
the asymptotic limit. 

As before, our result easily leads to a statement 
we are already familiar with. Namely, that if the 
environment is very large compared to the system (i.e. 
$\log d_{\Omega_E} > 2 \log d_S$) then the state of the system will 
be the same for all but exponentially few initial states 
of the environment restricted to $\mathcal{H}_{\Omega_E}$. Note that the 
condition "all but exponentially few" is not a mathematical
artifact of our proof. For some examples, one 
can find very specific initial states of the environment 
that will lead to observable effects on the system even 
if the environment is large. A similar statement was 
shown before for almost all times [1]. Since our result 
holds for all times, it does in particular imply said result. 

Time-scales In physical situations we usually expect the 
environment to be dimension-wise much larger than the 
system. Following the above discussion, the criterion of 
independence of the initial state of the environment is 
therefore of no use to investigate the time-scales on which 
thermalization happens; hence we focus on independence of 
the system's own initial state. We can guarantee that 
the system still "remembers" its initial state as long as 
(3) with $\tau_{SE}(t)$ as defined in (1) is fulfilled. It is thus 
interesting to study how fast $H^{\varepsilon}_{\min}(S)_\tau$ decreases from 
its initial value $\log d_{\Omega_S}$ and how fast $H^{\varepsilon}_{\max}(E)_\tau$ increases 
from zero. The answer to this question of course depends 
on the specific model under consideration. In a physical 
model with only local interactions we expect, for example,
that the entropies in S and E respectively can be 
changed faster the larger the interaction surface between
S and E is. While the rate of change of the von 
Neumann entropy has been studied intensively [16-18], 
no such results exist for the one-shot entropies $H^{\varepsilon}_{\min}$ and 
$H^{\varepsilon}_{\max}$. As an illustration of our method, we derive a simple
bound for all $H_{SE}$ without taking locality constraints 
into account. Namely we show that changes of $H^{\varepsilon}_{\min}(S)_\tau$ 
and $H^{\varepsilon}_{\max}(E)_\tau$ need a lot of time if either the initial state 

$$\tau_{SE}(0) = \pi_{\Omega_S} \otimes |\psi\rangle\langle\psi|_E \tag{7}$$

is close to commuting with the Hamiltonian $H_{SE}$ or if 
the interactive part of the Hamiltonian [39] 

$$H_{\mathrm{int}} := H_{SE} - H_S \otimes \mathbb{I}_E - \mathbb{I}_S \otimes H_E \tag{8}$$

is weak, if measured in the usual operator norm. Applying
these findings to criterion (3), we find that S retains 
at least some memory about its initial state for times of 
order 

$$\mathcal{O}\left( \max\left\{ \frac{1}{4 \|H_{\mathrm{int}}\|_\infty} ,\ \frac{1}{\left\| \left[ \pi_{\Omega_S} \otimes |\psi\rangle\langle\psi|_E ,\, H_{SE} \right] \right\|_1} \right\} \right) . \tag{9}$$

A toy model shows that this simple bound is indeed tight 
up to some constant factor. Still we expect to find better 
bounds in cases where we do impose locality assumptions 
on the Hamiltonian $H_{SE}$. In this case, results of the Lieb-Robinson
type like [19] may be applied to bound the rates 
with which the min- and max-entropies can be changed. 
Our criterion hence opens the door to improved lower 
bounds on the thermalization times. 

Recent work tackled the problem of thermalization 
time-scales from another angle: in [20] sufficient time- 
scales for equilibration were derived. In contrast, note 
that we are interested in necessary time-scales for 
thermalization. 

Absence of thermalization Finally, we consider the 
question whether it is at all possible for the system to forget
about its initial conditions, even if the environment 
is large. In [1] it is shown that the temporal average of S 
will be independent of its initial state if the relevant energy
eigenstates of $H_{SE}$ are sufficiently entangled. Here, 
we prove a converse result. That is, we provide sufficient 
conditions under which a system never becomes independent
of its initial state, for all but exponentially few initial 
states of the environment. Our result extends a recent 
result of [21], which we compare to ours in the appendix. 
The most important advantage of our result is that we 
can make statements about the time-evolved state of S 
(as opposed to statements about temporal averages) and 
do not require S to be small. Roughly, we show that 
if those eigenstates of the Hamiltonian which on S have 
most overlap with the initial state are not sufficiently entangled,
the state of the system will remain close to its 
initial state for all times, for all but exponentially few 
initial states $|\psi\rangle_E$ of the environment. 

Let us now explain more precisely what we mean 
by these conditions. Note that the energy eigenstates 
$\{|E_k\rangle_{SE}\}_{k=1,\dots,d_S d_E}$ form a basis of the product space 
$\mathcal{H}_S \otimes \mathcal{H}_E$. Assume that we want to approximate this basis 
by a product basis $\{|i\rangle_S \otimes |j\rangle_E\}_{i=1,\dots,d_S,\ j=1,\dots,d_E}$. That 
is, to each energy eigenstate $|E_k\rangle_{SE}$ we assign the element
of the product basis $|i\rangle_S \otimes |j\rangle_E$ which best approximates
it and assume that this correspondence is one-to-one.
Let $I(i)$ denote the set of energy eigenstates which 
are assigned to a state of the form $|i\rangle_S \otimes |j\rangle_E$, with a 
fixed i and an arbitrary j. We introduce the quantity 
$\delta(i)$ to quantify how well the energy eigenstates in $I(i)$ 
are approximated by an element of the product basis, 

$$\delta(i) := \min_{|E_k\rangle \in I(i)}\ \max_{j=1,\dots,d_E} \left| \langle E_k|_{SE}\, |i\rangle_S |j\rangle_E \right| . \tag{10}$$

Let $\rho_S(t)$ denote the state of the system at time t and 
assume that its initial state was $\rho_S(0) = |i\rangle\langle i|_S$. Then at 
any time t the probability that $\rho_S(t)$ is further away from 
its initial state than $4\delta(i)\sqrt{1 - \delta(i)^2}$ (in trace distance 
$\|\cdot\|_1$) is exponentially small. This radius is small if 
$\delta(i)$ is close to 1, that is, if the energy eigenstates which 
on S are most similar to $|i\rangle\langle i|_S$ are sufficiently close to 
product. The probability is computed over the choice of 
the initial state of the environment $|\psi\rangle_E$. 
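
To get a feeling for the size of this radius, the following sketch simply evaluates $4\delta\sqrt{1-\delta^2}$ for a given overlap $\delta$. The function name and the sample values of $\delta$ are hypothetical. Since the trace distance is used here without the factor 1/2, it can be as large as 2, so the radius is informative only when it is well below 2.

```python
import math

def deviation_radius(delta):
    """Trace-distance radius 4*delta*sqrt(1 - delta^2) for an overlap delta in [0, 1]."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta is an overlap and must lie in [0, 1]")
    return 4 * delta * math.sqrt(1 - delta ** 2)

# Sample overlaps (assumed values, for illustration only).
for delta in (0.9, 0.99, 0.999, 1.0):
    print(delta, deviation_radius(delta))
```

The radius shrinks to zero as $\delta$ approaches 1 and reaches the trivial value 2 at $\delta = 1/\sqrt{2}$.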



II. METHODS 

Let us now explain the main conceptual idea in proving
our results on initial state independence - our results 
on time scales and absence of thermalization then follow 
using involved, but relatively standard technical methods
as outlined in the appendix. For simplicity, let us 
thereby first consider the question whether the system 
depends on its initial state $|\phi\rangle_S \in \mathcal{H}_{\Omega_S}$ at time t. The 
key idea to our proof is to take an information theoretic 
standpoint and look at this problem from the perspective 
of an outside reference system R who prepares the system
S in its initial state. That is, R and S are initially 
correlated which, in the simple case that S is pure, can 
be understood as R having prepared S in a definite state 
$|\phi\rangle_S$. Instead of asking whether the system still depends 
on its initial state at time t, we can now equivalently ask 
whether R still "knows" about the system state at time 
t, or whether R has become decoupled from S. 



A central theorem in quantum information theory 
known as the decoupling theorem [22, 23] quantifies 
whether a particular process, i.e. a quantum channel 
$\mathcal{T}_{A \to B}$ acting on the system A, can have this decoupling 
effect. Originally devised to demonstrate the existence 
of certain encoding schemes to preserve quantum information
[24], we will employ it here in a different context. 
Using techniques from measure concentration and the decoupling
theorem, we obtain that for any $\varepsilon > 0$, 
$\delta > 0$ 

$$\Pr_{|\phi\rangle_A}\left\{ \left\| \mathcal{T}_{A \to B}\left(|\phi\rangle\langle\phi|_A\right) - \mathcal{T}_{A \to B}(\pi_A) \right\|_1 \geq 2^{-\zeta/2} + 12\varepsilon + \delta \right\} \leq 2 \exp\left(-d_A \delta^2 / 16\right) , \tag{11}$$

where the probability is taken over the Haar measure, 
$\pi_A = \mathbb{I}_A / d_A$, and 

$$\zeta = H^{\varepsilon}_{\min}(A'|B)_\tau \tag{12}$$
$$\geq H^{\varepsilon}_{\min}(A'B)_\tau - H^{\varepsilon}_{\max}(B)_\tau - O\left(\log\frac{1}{\varepsilon}\right) . \tag{13}$$

The state $\tau_{A'B}$ appearing in the last expression is the 
Choi-Jamiolkowski representation of the channel $\mathcal{T}_{A \to B}$ 

$$\tau_{A'B} = \left(\mathcal{I}_{A'} \otimes \mathcal{T}_{A \to B}\right)\left(|\Psi\rangle\langle\Psi|_{A'A}\right) , \tag{14}$$

where $|\Psi\rangle_{A'A}$ is the maximally entangled state across A 
and A' and $\mathcal{I}_{A'}$ denotes the identity channel. Clearly, to 
obtain a strong bound, we need $\zeta$ to be big for a small $\varepsilon$. 
Intuitively, $\zeta$ measures "how good the channel $\mathcal{T}_{A \to B}$ is 
at decoupling", i.e. at destroying potential correlations a 
(non-pure) input state on A might have to some outside 
reference R. In particular, the bound on $\zeta$ tells us that 
for $H^{\varepsilon}_{\min}(A'B)_\tau \gtrsim H^{\varepsilon}_{\max}(B)_\tau$ almost all states $|\phi\rangle_A$ have 
become independent and are close to $\tau_B$. A more general 
statement involving mixed input states can be found in 
the appendix. 
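
The concentration bound above can be evaluated numerically to see how quickly the exceptional set of input states shrinks with the dimension. The sketch below assumes the form of the bound as reconstructed here, namely a deviation threshold $2^{-\zeta/2} + 12\varepsilon + \delta$ and a failure probability of at most $2\exp(-d_A \delta^2/16)$. The function names and all numerical values are hypothetical.

```python
import math

def deviation_threshold(zeta, eps, delta):
    """Trace-distance threshold 2^(-zeta/2) + 12*eps + delta appearing in the bound."""
    return 2 ** (-zeta / 2) + 12 * eps + delta

def failure_probability_bound(d_a, delta):
    """Upper bound 2*exp(-d_A*delta^2/16) on the Haar fraction of exceptional states.

    The value is capped at 1, since a probability bound above 1 carries no information.
    """
    return min(1.0, 2 * math.exp(-d_a * delta ** 2 / 16))

# Assumed example: 20 qubits (d_A = 2^20), zeta = 20 bits, eps = 0.001, delta = 0.05.
print(deviation_threshold(20, 0.001, 0.05))
print(failure_probability_bound(2 ** 20, 0.05))
```

For small dimensions the bound is vacuous (it is capped at 1), which reflects that "all but exponentially few" is a statement about large systems.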

In the appendix, we derive a statement converse to 
the above which we briefly sketch here: We show that 
if $H^{\varepsilon}_{\max}(A'B)_\tau \lesssim H^{\varepsilon}_{\min}(B)_\tau$, then there is no state $\omega_B$ 
which is such that most input states $|\phi\rangle\langle\phi|_A$ yield a channel
output $\mathcal{T}_{A \to B}\left(|\phi\rangle\langle\phi|_A\right)$ which is close to it. 

To see how we can apply this result to our situation, 
note that for the product initial state $|\phi\rangle_S \otimes |\psi\rangle_E$ with 
$|\phi\rangle_S \in \mathcal{H}_{\Omega_S} \subseteq \mathcal{H}_S$ and $|\psi\rangle_E \in \mathcal{H}_{\Omega_E} \subseteq \mathcal{H}_E$ we find the 
state of the system after time t to be 

$$\rho_S(t) = \mathrm{tr}_E\left[ U(t) \left( |\phi\rangle\langle\phi|_S \otimes |\psi\rangle\langle\psi|_E \right) U(t)^\dagger \right] . \tag{15}$$

We can look at this either as a quantum channel $\mathcal{T}_{S \to S}$ 
(taking $|\phi\rangle\langle\phi|_S$ as an input) or as a quantum channel 
$\mathcal{T}_{E \to S}$ (taking $|\psi\rangle\langle\psi|_E$ as an input). The first channel 
captures the influence that the initial state of the system
has on the system state at time t. The latter captures
the influence of the initial state of the environment. 
Applying the above results to these channels and using 
some basic properties of the smooth entropies then yields 
our results about independence of the initial state in a 
straightforward manner. For example, for $\mathcal{T}_{S \to S}$, we apply
our theorems with A = S and B = S. Note that a 
purification of $\tau_{S'S}$ is simply obtained by omitting the 
trace over E in the channel, i.e., 

$$\tau_{S'SE}(t) = U_{SE}(t) \left( |\Psi\rangle\langle\Psi|_{S'S} \otimes |\psi\rangle\langle\psi|_E \right) U_{SE}(t)^\dagger . \tag{16}$$

Tracing over S' then gives the special state $\tau_{SE}(t)$. 
Since $\tau_{S'SE}(t)$ is pure, the smooth entropies of S'S coincide 
with those of E, e.g. $H^{\varepsilon}_{\min}(S'S)_\tau = H^{\varepsilon}_{\min}(E)_\tau$, 
yielding the claimed entropy conditions. 



We have shown that the problem of finding lower 
bounds on thermalization times can be simplified considerably:
for almost all states it suffices to understand 
the temporal evolution of the special state $\tau_{SE}(t)$, or rather 
changes in entropy for this state. Whereas our general 
example bounds for entropy changes are rather weak, our 
result opens the door for more sophisticated methods to 
be applied, taking into account the locality of the Hamiltonian.
The emergence of a special state $\tau_{SE}$ is indeed 
somewhat analogous to the setting of channel coding, 
where the maximally entangled state plays an important 
role in quantifying a channel's capacity to carry quantum 
information. Note, however, that we do not ask about 
how much quantum information could be conveyed by 
using any form of coding scheme. Furthermore, merely 
asking whether the state of the system depends on its 
initial state after some time, or in more information the- 
oretic terms, asking whether the output state of the chan- 
nel depends on its input state does (unlike in the classi- 
cal world) not immediately answer the question whether 
this channel is useful for transmitting quantum informa- 
tion [3D]. 

Furthermore, note that all our statements hold "for al- 
most all initial states from the Haar measure", i.e., we 
make statements about the volume of states. Of course, 
from a given starting state it is in general not the case 
that all such states could be reached in a physical system, 
and hence one might question the relevance of our results. 
Note, however, that our approach applies to any set of 
unitaries which have such a decoupling effect. In [25] it 
is shown that random two-qubit interactions efficiently 
approximate the first and second moments of the Haar 
distribution, thereby constituting approximate 2-designs. 
This is all one needs for decoupling [26, 27]. It is an interesting
open question what other sets of unitaries have 
this property. 
In this appendix, we provide more detailed explanations and technical details of our claims. We would like to 
emphasize that from a quantum information theory perspective our proof is appealing in its simplicity - contrary 
to what the length of this appendix may suggest. However, for convenience of the reader we provide background 
material. 

The central idea behind our approach is to think of the time-evolution as a quantum channel. When considering what 
happens to the initial state of the system, we can think of the channel $\mathcal{T}_{S \to S}(\rho_S) = \mathrm{tr}_E\left[U(t)\left(\rho_S \otimes |\psi\rangle\langle\psi|_E\right) U(t)^\dagger\right]$ 
where $|\psi\rangle_E$ is the initial state of the environment. Similarly, when considering the initial state of the environment, we 
will think of the channel $\mathcal{T}_{E \to S}(\rho_E) = \mathrm{tr}_E\left[U(t)\left(|\phi\rangle\langle\phi|_S \otimes \rho_E\right) U(t)^\dagger\right]$ for fixed $|\phi\rangle_S$. 

We first provide the necessary background material about smooth min- and max-entropies (Section A). These 
entropies allow us to state a criterion for which a channel maps almost all its inputs to the same output (see Section 
B). We proceed to apply this criterion to the question of independence of the initial state of the environment (Section 
C), and independence of the initial state of the system (Section D). Finally, we provide sufficient conditions for which 
the system stays close to its initial state for all times (Section E). 
Appendix A: Entropy measures and the time needed to change them 

1. Notation 

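The dimension of the Hilbert space $\mathcal{H}_A$ is denoted by $d_A$. We write $\pi_A := \mathbb{I}_A / d_A$ to denote the fully mixed state on 
system A and $|\Psi\rangle_{AA'} := \frac{1}{\sqrt{d_A}} \sum_{i=1}^{d_A} |i\rangle_A \otimes |i\rangle_{A'}$ to denote the maximally entangled state between $\mathcal{H}_A$ and $\mathcal{H}_{A'}$, a copy 
of $\mathcal{H}_A$. We introduce the set of normalized density operators 

$$\mathcal{S}_{=}(\mathcal{H}_A) := \{\rho_A \in \mathrm{Herm}(\mathcal{H}_A) : \rho_A \geq 0,\ \mathrm{tr}\,\rho_A = 1\} \tag{A1}$$

as well as the set of subnormalized density operators 

$$\mathcal{S}_{\leq}(\mathcal{H}_A) := \{\rho_A \in \mathrm{Herm}(\mathcal{H}_A) : \rho_A \geq 0,\ \mathrm{tr}\,\rho_A \leq 1\} . \tag{A2}$$

For a Hermitian operator O we write $\lambda_{\max}(O)$ to denote its largest eigenvalue. For an arbitrary linear operator M, 
let $\|M\|_\infty := \sqrt{\lambda_{\max}(M^\dagger M)}$ and $\|M\|_1 := \mathrm{tr}\sqrt{M^\dagger M}$. With log we denote the binary logarithm. 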

2. Min- and max-entropy 

For $\rho_{AB} \in \mathcal{S}_{\leq}(\mathcal{H}_{AB})$ we introduce the min-entropy of A conditioned on B as 

$$H_{\min}(A|B)_\rho := \sup_{\sigma_B \in \mathcal{S}_{=}(\mathcal{H}_B)}\ \sup\left\{\lambda \in \mathbb{R} : 2^{-\lambda}\, \mathbb{I}_A \otimes \sigma_B \geq \rho_{AB}\right\} \tag{A3}$$

and the max-entropy of A conditioned on B as 

$$H_{\max}(A|B)_\rho := \sup_{\sigma_B \in \mathcal{S}_{=}(\mathcal{H}_B)} \log\left[F(\rho_{AB}, \mathbb{I}_A \otimes \sigma_B)\right]^2 . \tag{A4}$$

For a trivial system B they simplify to $H_{\min}(A)_\rho = -\log \lambda_{\max}(\rho)$ and $H_{\max}(A)_\rho = 2 \log \mathrm{tr}\sqrt{\rho}$. For $\rho_{AB} \in \mathcal{S}_{=}(\mathcal{H}_{AB})$, 
let $H(A|B)_\rho$ denote the well-known von Neumann entropy of A conditioned on B. Then we have from [13, Lemma 2 
and Lemma 20] that 

$$-\log d_{\min} \leq H_{\min}(A|B)_\rho \leq H(A|B)_\rho \leq H_{\max}(A|B)_\rho \leq \log d_A \tag{A5}$$

where $d_{\min} := \min\{d_A, d_B\}$. 

An example of the operational significance of $H_{\min}(A|B)_\rho$ is that its negative quantifies the maximal number of 
fully entangled bits achievable from $\rho_{AB}$ with local operations restricted to B. $H_{\max}(A|B)_\rho$ quantifies, for instance, 
how random A appears (when used to generate a key, for example) from the point of view of an adversary with access 
to B [10]. 

3. Distance measures 

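We call the metric $\|\rho - \sigma\|_1$ induced by the 1-norm the trace distance of $\rho, \sigma \in \mathcal{S}_{\leq}(\mathcal{H})$ and omit the usual factor $\frac{1}{2}$ 
in the definition. This distance measure determines the maximal distinguishability of the states $\rho$ and $\sigma$ [14]. 

A notion of the similarity of two states is given by the fidelity which generalizes the Hilbert space scalar product 
to mixed states. For $\rho, \sigma \in \mathcal{S}_{\leq}(\mathcal{H})$ it is defined by 

$$F(\rho, \sigma) := \left\|\sqrt{\rho}\sqrt{\sigma}\right\|_1 . \tag{A6}$$

If one of the states is pure, say $\rho = |\psi\rangle\langle\psi|$, we have 

$$F(|\psi\rangle\langle\psi|, \sigma) = \sqrt{\langle\psi|\sigma|\psi\rangle} . \tag{A7}$$

The fidelity can only increase under CPTPMs (e.g. partial traces) [28], i.e. 

$$F(\mathcal{T}(\rho), \mathcal{T}(\sigma)) \geq F(\rho, \sigma) . \tag{A8}$$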

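Many important properties involving the fidelity can be derived from the following theorem [29]. 

Theorem A.1 (Uhlmann's theorem). For $\rho, \sigma \in \mathcal{S}_{=}(\mathcal{H})$ we have 

$$F(\rho, \sigma) = \max_{|\psi\rangle, |\phi\rangle} \left|\langle\psi|\phi\rangle\right| \tag{A9}$$

where the maximum is over all purifications $|\psi\rangle$ of $\rho$ and $|\phi\rangle$ of $\sigma$. For a fixed purification $|\psi\rangle$ it suffices to maximize 
over all $|\phi\rangle$. 

The fidelity and the trace distance are essentially equivalent measures of the distance/similarity of two states 
$\rho, \sigma \in \mathcal{S}_{=}(\mathcal{H})$ as shown by the Fuchs-van de Graaf inequalities [30] 

$$1 - F(\rho, \sigma) \leq \frac{1}{2}\|\rho - \sigma\|_1 \leq \sqrt{1 - F(\rho, \sigma)^2} . \tag{A10}$$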
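
As a quick sanity check of these inequalities, the sketch below evaluates both sides for commuting (diagonal) states, where the fidelity reduces to $\sum_i \sqrt{p_i q_i}$ and the trace distance to $\sum_i |p_i - q_i|$. Restricting to diagonal states is our simplifying assumption so that no matrix square roots are needed. The function names and example spectra are hypothetical.

```python
import math

def fidelity_diag(p, q):
    """Fidelity of two diagonal states with spectra p and q in the same basis."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def trace_distance_diag(p, q):
    """Trace distance ||rho - sigma||_1 (without the factor 1/2) for diagonal states."""
    return sum(abs(a - b) for a, b in zip(p, q))

def fuchs_van_de_graaf_holds(p, q, tol=1e-12):
    """Check 1 - F <= (1/2)||rho - sigma||_1 <= sqrt(1 - F^2) up to rounding."""
    f = fidelity_diag(p, q)
    half_td = 0.5 * trace_distance_diag(p, q)
    upper = math.sqrt(max(0.0, 1 - f ** 2))
    return (1 - f) <= half_td + tol and half_td <= upper + tol

# Assumed example: a maximally mixed qubit versus a biased one.
print(fuchs_van_de_graaf_holds([0.5, 0.5], [0.9, 0.1]))
```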

By use of the fidelity we can define a distance measure satisfying many natural conditions. We first introduce the 
generalized fidelity for subnormalized states $\rho, \sigma \in \mathcal{S}_{\leq}(\mathcal{H})$ 

$$\bar{F}(\rho, \sigma) := F(\rho, \sigma) + \sqrt{(1 - \mathrm{tr}\,\rho)(1 - \mathrm{tr}\,\sigma)} \tag{A11}$$

which coincides with the usual fidelity if at least one of the states is normalized. This allows us to define the purified 
distance 

$$P(\rho, \sigma) := \sqrt{1 - \bar{F}(\rho, \sigma)^2} . \tag{A12}$$

For subnormalized states $\rho, \sigma \in \mathcal{S}_{\leq}(\mathcal{H})$ the purified distance satisfies the following properties [31]: 

• It is a metric. 

• It cannot increase under CPTPMs. 

• It is invariant under extensions and purifications in the sense that for every extension (purification) $\bar{\rho}$ of $\rho$ we 
can find an extension (purification) $\bar{\sigma}$ of $\sigma$ such that $P(\bar{\rho}, \bar{\sigma}) = P(\rho, \sigma)$. 

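We can find a statement similar to the Fuchs-van de Graaf inequalities for the purified distance and for subnormalized 
states. 

Lemma A.2. For $\rho, \sigma \in \mathcal{S}_{\leq}(\mathcal{H})$ we have 

$$\frac{1}{2}\|\rho - \sigma\|_1 \leq P(\rho, \sigma) \leq \sqrt{2\|\rho - \sigma\|_1} . \tag{A13}$$

If $\rho, \sigma \in \mathcal{S}_{=}(\mathcal{H})$ we have 

$$\frac{1}{2}\|\rho - \sigma\|_1 \leq P(\rho, \sigma) \leq \sqrt{\|\rho - \sigma\|_1} . \tag{A14}$$

Proof. Combining [31, Definition 1 and Lemma 6] we have 

$$\frac{1}{2}\|\rho - \sigma\|_1 + \frac{1}{2}\left|\mathrm{tr}\,\rho - \mathrm{tr}\,\sigma\right| \leq P(\rho, \sigma) \leq \sqrt{\|\rho - \sigma\|_1 + \left|\mathrm{tr}\,\rho - \mathrm{tr}\,\sigma\right|} . \tag{A15}$$

The second statement then follows trivially, the first statement follows with the observation that 

$$\left|\mathrm{tr}\,\rho - \mathrm{tr}\,\sigma\right| \leq \|\rho - \sigma\|_1 . \tag{A16}$$

□ 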

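By use of the purified distance we are able to define neighbourhoods of mixed states. For $\rho \in \mathcal{S}_{\leq}(\mathcal{H})$ and $\varepsilon \geq 0$ 
with $\mathrm{tr}\,\rho > 0$ we define an $\varepsilon$-ball in $\mathcal{S}_{\leq}(\mathcal{H})$ around $\rho$ as 

$$\mathcal{B}^{\varepsilon}(\rho) := \{\sigma \in \mathcal{S}_{\leq}(\mathcal{H}) : P(\rho, \sigma) \leq \varepsilon\} . \tag{A17}$$

From the triangle inequality for P we find the following triangle inequality for the $\varepsilon$-balls: 

$$\tau \in \mathcal{B}^{\varepsilon}(\rho)\ \wedge\ \sigma \in \mathcal{B}^{\varepsilon'}(\tau)\ \Rightarrow\ \sigma \in \mathcal{B}^{\varepsilon + \varepsilon'}(\rho) . \tag{A18}$$

For more details about the purified distance and $\varepsilon$-balls we refer to [31]. 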
4. Smooth entropy measures 

A problem with the conditional min- and max-entropies introduced above is that they are sensitive to small variations
of the state on which they are defined whereas the physical quantities we are bounding with them generally are 
not. Following an idea first introduced to quantum mechanics in [32] we will therefore use "smooth" versions of these 
entropy measures [41]. Roughly speaking, the smoothing means that states which are highly untypical do not have 
to be taken into account. 

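For $\varepsilon \geq 0$ and $\rho_{AB} \in \mathcal{S}_{\leq}(\mathcal{H}_{AB})$ we define the $\varepsilon$-smooth min-entropy of A conditioned on B as 

$$H^{\varepsilon}_{\min}(A|B)_\rho := \sup_{\tilde{\rho}_{AB} \in \mathcal{B}^{\varepsilon}(\rho_{AB})} H_{\min}(A|B)_{\tilde{\rho}} \tag{A19}$$

and the $\varepsilon$-smooth max-entropy of A conditioned on B as 

$$H^{\varepsilon}_{\max}(A|B)_\rho := \inf_{\tilde{\rho}_{AB} \in \mathcal{B}^{\varepsilon}(\rho_{AB})} H_{\max}(A|B)_{\tilde{\rho}} . \tag{A20}$$

Since all Hilbert spaces in this appendix are finite dimensional, we can and will replace the suprema and infima by 
maxima and minima, respectively. In particular, we will make use of the fact that there is a state in the $\varepsilon$-ball which 
achieves the extremal value. Note that $H^{\varepsilon}_{\min}(A|B)_\rho$ is monotonically increasing and $H^{\varepsilon}_{\max}(A|B)_\rho$ monotonically decreasing 
in $\varepsilon$. 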

The relevance of the smooth entropies is due to the fact that they can be given an operational meaning in one-shot 
scenarios, where $\varepsilon$ usually plays the role of an error probability. On the other hand, the von Neumann entropy is 
mainly relevant in an i.i.d. scenario. $H^{\varepsilon}_{\max}(A|B)$, for example, quantifies the work cost to erase system A conditioned 
on a memory B, except with a certain probability. 

The smooth min- and max-entropy are dual to each other in the sense that if $\rho_{ABC} \in \mathcal{S}_{\leq}(\mathcal{H}_{ABC})$ is pure we have 

$$H^{\varepsilon}_{\max}(A|B)_\rho = -H^{\varepsilon}_{\min}(A|C)_\rho . \tag{A21}$$

Furthermore, the smooth entropies $H^{\varepsilon}_{\min}(A|B)_\rho$ and $H^{\varepsilon}_{\max}(A|B)_\rho$ are invariant under isometries acting on A or B, i.e. they do not depend on the Hilbert space 
used to represent the density operator locally. These two properties of the smooth entropy measures crucially depend 
on the choice of P as the relevant distance measure. The smooth entropy measures share natural properties with the 
usual von Neumann entropy like strong subadditivity [31, footnote 7] 

$$H^{\varepsilon}_{\min}(A|BC)_\rho \leq H^{\varepsilon}_{\min}(A|B)_\rho \quad \text{and}$$

$$H^{\varepsilon}_{\max}(A|BC)_\rho \leq H^{\varepsilon}_{\max}(A|B)_\rho . \tag{A22}$$

It can be seen from the Schmidt decomposition that for a pure state $|\phi\rangle\langle\phi|_{AB}$ the entropies of the marginals on the 
A- and B-subsystem are identical. This observation generalizes to the case of the smooth entropy measures. 

Lemma A. 3. Let E S={Ha ®'Hb) be a pure state. Then, 

HLi„(^)0 = H^i„(-S)0 and 

HLax(^)0 = H^ax(^)0 ■ (A-23) 

Proof. Since tr^ and tr^ have the same eigenvalues, there is an isometry mapping one to the other. 

The statement then follows directly from the invariance of the smooth entropy measures under isometrics. □ 

While the smooth entropy measures coincide (for $\varepsilon \to 0$) with the von Neumann entropy for density operators
which are proportional to projectors, they are strictly more general. We recover the von Neumann entropy from the
smooth entropy measures in an i.i.d. (independent and identically distributed) scenario.

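Theorem A.4 (Fully Quantum Asymptotic Equipartition Property). fUi] Let $\varepsilon > 0$ and let $\rho_{AB} \in \mathcal{S}_{=}(\mathcal{H}_A \otimes \mathcal{H}_B)$.
Then,

$$\lim_{\varepsilon \to 0} \lim_{n \to \infty} \frac{1}{n} H_{\min}^{\varepsilon}(A|B)_{\rho^{\otimes n}} = H(A|B)_{\rho} \qquad \text{(A24)}$$

$$\lim_{\varepsilon \to 0} \lim_{n \to \infty} \frac{1}{n} H_{\max}^{\varepsilon}(A|B)_{\rho^{\otimes n}} = H(A|B)_{\rho} . \qquad \text{(A25)}$$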
5. Chain rules

In order to deal with the introduced smooth entropy measures, chain rules are indispensable. 
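Lemma A.5. f2^ Lemma A.6.] Let $\varepsilon > 0$, $\varepsilon', \varepsilon'' \geq 0$ and $\rho_{ABC} \in \mathcal{S}_{=}(\mathcal{H}_{ABC})$. Then,

$$H_{\min}^{\varepsilon'}(A|BC)_{\rho} \leq H_{\min}^{\varepsilon + 2\varepsilon' + \varepsilon''}(AB|C)_{\rho} - H_{\min}^{\varepsilon''}(B|C)_{\rho} + \log\frac{2}{\varepsilon^2} . \qquad \text{(A26)}$$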

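In the other direction (i.e. in order to lower-bound $H_{\min}^{\varepsilon}(A|B)_{\rho}$), we will use two chain rules neither of which is
stronger than the other. Since we will not need it, we omit the conditioning system $C$.

Lemma A.6. For any $\varepsilon \geq 0$, $\rho_{AB} \in \mathcal{S}_{\leq}(\mathcal{H}_A \otimes \mathcal{H}_B)$ we have

$$H_{\min}^{\varepsilon}(A|B)_{\rho} \geq H_{\min}^{\varepsilon}(AB)_{\rho} - \log d_B . \qquad \text{(A27)}$$

Proof. Choose $\tilde\rho_{AB} \in \mathcal{B}^{\varepsilon}(\rho_{AB})$ such that $H_{\min}(AB)_{\tilde\rho} = H_{\min}^{\varepsilon}(AB)_{\rho}$. From [31, Lemma 3.1.10] we have

$$H_{\min}(A|B)_{\tilde\rho} \geq H_{\min}(AB)_{\tilde\rho} - \log \mathrm{rank}\,\tilde\rho_B . \qquad \text{(A28)}$$

By definition $H_{\min}^{\varepsilon}(A|B)_{\rho} \geq H_{\min}(A|B)_{\tilde\rho}$ and $\log \mathrm{rank}\,\tilde\rho_B \leq \log d_B$, and hence the assertion. □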

Lemma A. 7. Let e > and pab £ S^{'Ha ^Hb)- Then, 

5 £ 24 

^l,M\B)p > H^i„(^i?)p - H|,ax(i3)p - 2 • log - . (A29) 

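Proof. We introduce the auxiliary quantity

$$H_R(A)_{\rho} := -\sup\{\lambda \in \mathbb{R} : \rho_A \geq 2^{\lambda}\,\rho_A^0\} \qquad \text{(A30)}$$

where $\rho_A^0$ denotes the projector onto $\mathrm{supp}(\rho_A)$. Since $H_R(A)_{\rho}$ is the negative logarithm of the smallest non-zero
eigenvalue of $\rho_A$ it is obvious that $H_R(A)_{\rho} \geq \log \mathrm{rank}\,\rho_A$. Using (A28) we find

$$H_{\min}(A|B)_{\rho} \geq H_{\min}(AB)_{\rho} - \log \mathrm{rank}\,\rho_B \geq H_{\min}(AB)_{\rho} - H_R(B)_{\rho} . \qquad \text{(A31)}$$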

By the definition of the smooth min-entropy and ( |A31[ ) we have 

Hf„i„(A|B)p> max {ll,^in{AB)p - Hr{B)p} 

> max <nia.x[R^in{AB)usLjnB - HR{B)nBunB]i- ■ (A32) 



i^ab£B2 (pab) 



n 



The maximum maxn^^ ranges over all < < Is such that Hbojab^b ^ (w^^) and hence by use of the triangle 



inequality (A18) Ubi^ab^b G B^{pab)- Using the auxiliary Lemma F.l we find 



W^,^{A\B)p> max H,„i„(AB)^ - min [i/fl(B)n^^n«] ^ . (A33) 

'^ab<^B2(pab) I n« J 

As a next step we choose ujab — ^ab G [pab) such that 'E.^^^{AB)p = }in^in{AB)a,. Hence we get 

Rl,^{A\B)p>Rl,^^{AB)p-mm[HR{B)n^onB] , (A34) 

Us 



F.2 



where now the maximum maxn^ ranges over all < lis < Ib such that IIb<^ab^b G (ujab)- Using Lemma 
we can choose < 11^ < 1^ with Ubi^ab^b G B^ (ujab) such that 

HR{B)nAC.nA < ni^Bh - ^ • log |^ . (A35) 



From this we finally obtain 

Rl,,M\B)p > ^LiAB)p - Hi,(B)i + 2 . log 



24 

> H* - uL.{B)p + 2 • log 1^ . (A36) 

□ 
6. The times which are necessary for entropy changes



Since the unitary time evolution of quantum mechanics does not change the eigenvalues of the states on which it 
acts, entropy changes can only occur if we consider one part of a bipartite system and an interactive Hamiltonian 
governing the joint evolution of the system. The part of the bipartite system we are particularly interested in will 
simply be called "the system" S and the other part its "environment" E. We do not make any assumptions about 
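these two systems except that their joint Hilbert space can be written as a tensor product space $\mathcal{H}_{SE} = \mathcal{H}_S \otimes \mathcal{H}_E$.
We decompose their joint Hamiltonian $H_{SE}$ into a term acting non-trivially only on the system $S$, a term acting
non-trivially only on the environment $E$ and an "interaction term"

$$H_{SE} = H_S \otimes 1_E + 1_S \otimes H_E + H_{\mathrm{int}} . \qquad \text{(A37)}$$

Theorem A.8. Consider a state $\rho_{SE}(0) \in \mathcal{S}_{=}(\mathcal{H}_S \otimes \mathcal{H}_E)$ evolving under a Hamiltonian $H_{SE}$ with interaction strength
$\|H_{\mathrm{int}}\|_{\infty}$. Then for all times $t$

$$\left| \frac{d}{dt} \lambda_{\max}(\rho_S(t)) \right| \leq \frac{1}{T} \qquad \text{(A38)}$$

where

$$T(H_{SE}, \rho_{SE}(0)) := \left( \min\{ 4\|H_{\mathrm{int}}\|_{\infty},\ \|[H_{SE}, \rho_{SE}(0)]\|_1 \} \right)^{-1} . \qquad \text{(A39)}$$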



It is worth noting that the above bound is symmetric under an exchange of $S$ and $E$. In particular it does not
matter whether we want to change the maximal eigenvalue of the larger or the smaller part of the joint system $SE$. We
will find the time $T$ to be of fundamental importance when lower-bounding the times which are needed for entropy
changes. It only depends on the Hamiltonian and the initial state and diverges if either the system does not
interact with the environment or if the initial state commutes with the Hamiltonian. In these cases no changes of the
local eigenvalues and thus the local entropies are possible. In the latter case the initial state does not evolve at all, as
can be seen from the von Neumann equation. We need a long time to change local entropies if the interactive part of
the Hamiltonian is weak or if the initial state is close to a mixture of energy eigenstates of the Hamiltonian. Given
$H_{SE}$, the decomposition (A37) is not unique. The freedom in the decomposition can be used to optimize the r.h.s.
of (A38), that is to minimize $\|H_{\mathrm{int}}\|_{\infty}$. Additionally, the bound may be optimized by restricting $\|H_{\mathrm{int}}\|_{\infty}$ to those
eigenvalues of $H_{\mathrm{int}}$ for which the corresponding eigenstates have non-vanishing overlap with the initial state $\rho_{SE}(0)$.



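Proof. Up to (A43) the proof is due to J4J and reproduced here for completeness. Let $\alpha > 0$. By use of the von
Neumann equation we have

$$\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\} = -i\alpha\,\mathrm{tr}_S\{\rho_S^{\alpha-1}(t)\,\mathrm{tr}_E[H_{SE}, \rho_{SE}(t)]\} = -i\alpha\,\mathrm{tr}_{SE}\{(\rho_S^{\alpha-1}(t) \otimes 1_E)[H_{SE}, \rho_{SE}(t)]\} . \qquad \text{(A40)}$$

Using the cyclic property of the trace we have

$$\mathrm{tr}_{SE}\{(\rho_S^{\alpha-1}(t) \otimes 1_E)[H_S \otimes 1_E, \rho_{SE}(t)]\} = \mathrm{tr}_{SE}\{[\rho_S^{\alpha-1}(t) \otimes 1_E, H_S \otimes 1_E]\,\rho_{SE}(t)\} = \mathrm{tr}_S\{[\rho_S^{\alpha-1}(t), H_S]\,\mathrm{tr}_E\,\rho_{SE}(t)\} = \mathrm{tr}_S\{[\mathrm{tr}_E\,\rho_{SE}(t), \rho_S^{\alpha-1}(t)]\,H_S\} = 0 . \qquad \text{(A41)}$$

Furthermore,

$$\mathrm{tr}_{SE}\{(\rho_S^{\alpha-1}(t) \otimes 1_E)[1_S \otimes H_E, \rho_{SE}(t)]\} = \mathrm{tr}_{SE}\{[\rho_S^{\alpha-1}(t) \otimes 1_E, 1_S \otimes H_E]\,\rho_{SE}(t)\} = 0 . \qquad \text{(A42)}$$

We conclude that

$$\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\} = -i\alpha\,\mathrm{tr}_{SE}\{(\rho_S^{\alpha-1}(t) \otimes 1_E)[H_{\mathrm{int}}, \rho_{SE}(t)]\} = -i\alpha\,\mathrm{tr}_{SE}\{H_{\mathrm{int}}\,[\rho_{SE}(t), \rho_S^{\alpha-1}(t) \otimes 1_E]\} . \qquad \text{(A43)}$$

We introduce the notation $\rho_{\mathrm{cor}}(t) := \rho_{SE}(t) - \rho_S(t) \otimes \rho_E(t)$ to find

$$\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\} = -i\alpha\,\mathrm{tr}_{SE}\{H_{\mathrm{int}}\,[\rho_{\mathrm{cor}}(t), \rho_S^{\alpha-1}(t) \otimes 1_E]\} ,$$

since $\rho_S(t) \otimes \rho_E(t)$ commutes with $\rho_S^{\alpha-1}(t) \otimes 1_E$. We bound the absolute value of this derivative by use of the following inequality for bounded operators $A$, $B$

$$|\mathrm{tr}(AB)| \leq \mathrm{tr}|AB| = \|AB\|_1 \leq \|A\|_1 \|B\|_{\infty} \qquad \text{(A45)}$$

and the triangle inequality, which yields

$$\left|\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right| \leq \alpha\,\|H_{\mathrm{int}}\|_{\infty}\,\|[\rho_{\mathrm{cor}}(t), \rho_S^{\alpha-1}(t) \otimes 1_E]\|_1 \leq 2\alpha\,\|H_{\mathrm{int}}\|_{\infty}\,\|\rho_{\mathrm{cor}}(t)\|_1\,\|\rho_S^{\alpha-1}(t) \otimes 1_E\|_{\infty} = 2\alpha\,\|H_{\mathrm{int}}\|_{\infty}\,\|\rho_{\mathrm{cor}}(t)\|_1\,\|\rho_S^{\alpha-1}(t)\|_{\infty} .$$

The term $\|\rho_{\mathrm{cor}}(t)\|_1$ is a trace distance and hence upper-bounded by 2. Since $\alpha > 1$ we have $\|\rho_S^{\alpha-1}(t)\|_{\infty} = \lambda_{\max}(\rho_S(t))^{\alpha-1}$.
In conclusion,

$$\left|\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right| \leq 4\alpha\,\|H_{\mathrm{int}}\|_{\infty}\,\lambda_{\max}(\rho_S(t))^{\alpha-1} \leq 4\alpha\,\|H_{\mathrm{int}}\|_{\infty}\,\left(\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right)^{\frac{\alpha-1}{\alpha}} . \qquad \text{(A48)}$$

Another upper bound for $|\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}|$ can be obtained from (A40). We define $U_{SE} := e^{-iH_{SE}t}$. Since $U_{SE}$
commutes with the Hamiltonian we have

$$\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\} = -i\alpha\,\mathrm{tr}_{SE}\{(\rho_S^{\alpha-1}(t) \otimes 1_E)\,U_{SE}\,[H_{SE}, \rho_{SE}(0)]\,U_{SE}^{\dagger}\} = -i\alpha\,\mathrm{tr}_{SE}\{U_{SE}^{\dagger}\,(\rho_S^{\alpha-1}(t) \otimes 1_E)\,U_{SE}\,[H_{SE}, \rho_{SE}(0)]\} .$$

We have again by (A45)

$$\left|\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right| \leq \alpha\,\|U_{SE}^{\dagger}(\rho_S^{\alpha-1}(t) \otimes 1_E)U_{SE}\|_{\infty}\,\|[H_{SE}, \rho_{SE}(0)]\|_1 \leq \alpha\,\|\rho_S^{\alpha-1}(t)\|_{\infty}\,\|[H_{SE}, \rho_{SE}(0)]\|_1 = \alpha\,\lambda_{\max}(\rho_S(t))^{\alpha-1}\,\|[H_{SE}, \rho_{SE}(0)]\|_1 \leq \alpha\,\left(\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right)^{\frac{\alpha-1}{\alpha}}\,\|[H_{SE}, \rho_{SE}(0)]\|_1 . \qquad \text{(A50)}$$

Defining

$$T(H_{SE}, \rho_{SE}(0)) := \left( \min\{ 4\|H_{\mathrm{int}}\|_{\infty},\ \|[H_{SE}, \rho_{SE}(0)]\|_1 \} \right)^{-1}$$

the differential inequalities (A48) and (A50) can be combined to

$$\left|\frac{d}{dt}\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right| \leq \frac{\alpha}{T}\,\left(\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right)^{\frac{\alpha-1}{\alpha}} .$$

By use of Lemma F.3 we obtain that for all times $t$

$$\left(\mathrm{tr}_S\{\rho_S^{\alpha}(t)\}\right)^{1/\alpha} \in \left[ \left(\mathrm{tr}_S\{\rho_S^{\alpha}(0)\}\right)^{1/\alpha} - \frac{t}{T},\ \left(\mathrm{tr}_S\{\rho_S^{\alpha}(0)\}\right)^{1/\alpha} + \frac{t}{T} \right] .$$

We introduce the Schatten $\alpha$-norm

$$\|A\|_{\alpha} := \left(\mathrm{tr}\,|A|^{\alpha}\right)^{1/\alpha} .$$

The time needed to change from $\mathrm{tr}_S\{\rho_S^{\alpha}(i)\}$ to $\mathrm{tr}_S\{\rho_S^{\alpha}(f)\}$ is therefore at least

$$T \cdot \left| \left(\mathrm{tr}_S\{\rho_S^{\alpha}(i)\}\right)^{1/\alpha} - \left(\mathrm{tr}_S\{\rho_S^{\alpha}(f)\}\right)^{1/\alpha} \right| = T \cdot \left| \|\rho_S(i)\|_{\alpha} - \|\rho_S(f)\|_{\alpha} \right| . \qquad \text{(A55)}$$

We take the limit $\alpha \to \infty$ and use that for $\rho_S \geq 0$ we have $\|\rho_S\|_{\infty} = \lambda_{\max}(\rho_S)$ to find a minimal time of
$T \cdot |\lambda_{\max}(\rho_S(i)) - \lambda_{\max}(\rho_S(f))|$ to change the maximal eigenvalue from $\lambda_{\max}(\rho_S(i))$ to $\lambda_{\max}(\rho_S(f))$. If it were possible
to have $|\frac{d}{dt}\lambda_{\max}(\rho_S(t))| > \frac{1}{T}$, this bound could be violated for an infinitesimally small change of $\lambda_{\max}(\rho_S(t))$, and
hence the assertion. □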
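The bound (A38) can be probed numerically. The following is a minimal sketch, assuming a randomly drawn Hamiltonian on a qubit system and a four-level environment and a random pure initial state; the helper names (`random_hermitian`, `time_scale`, `lambda_max_trajectory`) and all parameter values are illustrative choices and not part of the original text. It compares finite-difference slopes of $\lambda_{\max}(\rho_S(t))$ with $1/T$.

```python
import numpy as np

def random_hermitian(dim, rng):
    """Draw a random Hermitian matrix (illustrative ensemble, an assumption)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (a + a.conj().T) / 2

def partial_trace_env(rho_se, d_s, d_e):
    """Trace out the environment factor of an operator on H_S (x) H_E."""
    return np.trace(rho_se.reshape(d_s, d_e, d_s, d_e), axis1=1, axis2=3)

def time_scale(h_se, h_int, rho0):
    """T = (min{4 ||H_int||_inf, ||[H_SE, rho_SE(0)]||_1})^(-1), cf. (A39)."""
    op_norm = np.linalg.norm(h_int, 2)          # largest singular value
    comm = h_se @ rho0 - rho0 @ h_se
    trace_norm = np.linalg.norm(comm, 'nuc')    # sum of singular values
    return 1.0 / min(4 * op_norm, trace_norm)

def lambda_max_trajectory(h_se, rho0, d_s, d_e, times):
    """Largest eigenvalue of rho_S(t) for each t, via exact diagonalization."""
    energies, vecs = np.linalg.eigh(h_se)
    values = []
    for t in times:
        u = (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T
        rho_s = partial_trace_env(u @ rho0 @ u.conj().T, d_s, d_e)
        values.append(np.linalg.eigvalsh(rho_s)[-1])
    return np.array(values)

rng = np.random.default_rng(0)
d_s, d_e = 2, 4
h_int = 0.3 * random_hermitian(d_s * d_e, rng)
h_se = (np.kron(random_hermitian(d_s, rng), np.eye(d_e))
        + np.kron(np.eye(d_s), random_hermitian(d_e, rng)) + h_int)

psi = rng.normal(size=d_s * d_e) + 1j * rng.normal(size=d_s * d_e)
psi /= np.linalg.norm(psi)
rho0 = np.outer(psi, psi.conj())

times = np.linspace(0.0, 2.0, 401)
lam = lambda_max_trajectory(h_se, rho0, d_s, d_e, times)
slopes = np.abs(np.diff(lam)) / (times[1] - times[0])
T = time_scale(h_se, h_int, rho0)
print("max finite-difference slope:", slopes.max(), " 1/T:", 1.0 / T)
```

Because (A38) bounds the derivative at all times, every finite-difference slope is bounded by $1/T$ as well, so the printed slope should not exceed the printed $1/T$ up to rounding.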



In Section O we will be particularly interested in how fast $H_{\min}^{\varepsilon}(S)$ can decrease from its maximal value $\log d_S$ and
how fast $H_{\max}^{\varepsilon}(E)$ can increase if it is initially zero.

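Corollary A.9. Consider an initial state $\rho_{SE}(0) \in \mathcal{S}_{=}(\mathcal{H}_{SE})$ undergoing an evolution governed by a Hamiltonian
$H_{SE}$ with $T = T(H_{SE}, \rho_{SE}(0))$ as introduced in (A39). Assume that for $\varepsilon \geq 0$ we have $H_{\min}^{\varepsilon}(S)_{\rho(0)} = \log d_S$. Then for
$t \geq 0$ we have

$$H_{\min}^{\varepsilon}(S)_{\rho(t)} \geq -\log\left( \frac{1}{d_S} + \frac{t}{T} \right) . \qquad \text{(A56)}$$

Proof. Maximal initial entropy $H_{\min}^{\varepsilon}(S)_{\rho(0)} = \log d_S$ for $\varepsilon \geq 0$ implies that $\lambda_{\max}(\rho_S(0)) = \frac{1}{d_S}$. Integrating (A38) we
find that at time $t$

$$\lambda_{\max}(\rho_S(t)) \leq \frac{1}{d_S} + \frac{t}{T} . \qquad \text{(A57)}$$

The assertion then follows from direct application of the definition of the min-entropy. □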



We are furthermore interested in an upper bound on $H_{\max}^{\varepsilon}(E)_{\rho(t)}$ given that $\rho_E(0)$ is pure, i.e. that there is an
eigenvalue one. From Theorem A.8 we can conclude that after some brief enough time the maximal eigenvalue is still
close to 1 and hence all other eigenvalues are small. A sufficiently large smoothing parameter $\varepsilon$ then allows us to neglect
all eigenvalues but the maximal one. For such a smoothing parameter, we can even upper-bound $H_{\max}^{\varepsilon}(E)_{\rho(t)}$ by a
negative value.

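Corollary A.10. Consider an initial state $\rho_{SE}(0) \in \mathcal{S}_{=}(\mathcal{H}_{SE})$ undergoing an evolution governed by a Hamiltonian
$H_{SE}$ with $T = T(H_{SE}, \rho_{SE}(0))$ as introduced in (A39). Consider a pure initial state on $E$. After a time $t$ we have
for $\varepsilon \geq \sqrt{2t/T}$

$$H_{\max}^{\varepsilon}(E)_{\rho(t)} \leq -\log\frac{1}{1 - t/T} \leq -\frac{1}{\ln 2}\,\frac{t}{T} . \qquad \text{(A58)}$$

Proof. It follows directly from the definition of the purified distance (A12) that a normalized state $\rho_E \in \mathcal{S}_{=}(\mathcal{H}_E)$ has
a purified distance $\sqrt{1 - \lambda_{\max}(\rho_E)^2}$ to the subnormalized state which only consists of the eigenvalue $\lambda_{\max}(\rho_E)$ and
the projector onto the corresponding eigenstate. Thus, by definition of the smooth max-entropy

$$H_{\max}^{\sqrt{1 - \lambda_{\max}(\rho_E)^2}}(E)_{\rho} \leq -\log\frac{1}{\lambda_{\max}(\rho_E)} . \qquad \text{(A59)}$$

Integrating (A38) we find that if $\lambda_{\max}(\rho_E(0)) = 1$ we have $\lambda_{\max}(\rho_E(t)) \geq 1 - \frac{t}{T}$, so

$$\sqrt{1 - \lambda_{\max}(\rho_E(t))^2} \leq \sqrt{\frac{2t}{T}} . \qquad \text{(A60)}$$

Since a larger smoothing parameter leads to a smaller smooth max-entropy we conclude that

$$H_{\max}^{\sqrt{2t/T}}(E)_{\rho(t)} \leq -\log\frac{1}{1 - t/T} . \qquad \text{(A61)}$$

□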



As a corollary of [17] we find the following upper bound for the rate of change of the von Neumann entropy.

Corollary A.11. For $\rho_{SE}(t) \in \mathcal{S}_{=}(\mathcal{H}_{SE})$ evolving under a Hamiltonian (A37) we have

$$\left| \frac{d}{dt} H(S)_{\rho(t)} \right| \leq \gamma(d_S) \cdot \|H_{\mathrm{int}}\|_{\infty} \qquad \text{(A62)}$$

where $\gamma(d_S) := 2 \max_{\frac{1}{d_S} \leq \lambda \leq 1} \sqrt{\lambda(1 - \lambda)}\,\log\left(\frac{\lambda (d_S - 1)}{1 - \lambda}\right)$.

We have $\gamma(2) \approx 1.912$. The bound is optimal in the sense that it is achievable if $\log d_S \leq \log d_E$. We were not
able to include the relation between the initial state $\rho_{SE}(0)$ and the Hamiltonian $H_{SE}$ in the form of the commutator
$\|[H_{SE}, \rho_{SE}(0)]\|_1$ into an upper bound on the rate of change of the von Neumann entropy, as it was possible in the
case of Corollaries A.9 and A.10.
Proof. In [17] it is shown that for pure states $|\phi\rangle_{SE} \in \mathcal{H}_{SE}$ the optimal rate with which the local von Neumann
entropy can be increased (optimized over pure states $|\phi\rangle_{SE}$ and Hamiltonians $H_{SE}$) is given by

$$\frac{d}{dt} H(S)_{\phi(t)} = \gamma(d) \cdot \|H_{SE}\|_{\infty} \qquad \text{(A63)}$$

where $d = \min\{d_S, d_E\}$. For arbitrary $|\phi\rangle_{SE}$ and $H_{SE}$, the rate of increase of $H(S)$ is therefore at most $\gamma(d) \cdot \|H_{SE}\|_{\infty}$.
Multiplying all energy levels by $-1$ inverts the time-evolution, so an upper bound on the rate with which
the entropy can increase which only involves $\|H_{SE}\|_{\infty}$ is also an upper bound on the rate with which it may decrease,
so

$$\left| \frac{d}{dt} H(S)_{\phi(t)} \right| \leq \gamma(d) \cdot \|H_{SE}\|_{\infty} . \qquad \text{(A64)}$$

Following the same steps as in the derivation of (A43), it can be shown that only the term $H_{\mathrm{int}}$ in the decomposition
(A37) of the Hamiltonian $H_{SE}$ is relevant for changes of the local von Neumann entropy $H(S)$ [34]. Since we can
always choose a trivial decomposition with $H_S = H_E = 0$, the optimal decomposition satisfies $\|H_{\mathrm{int}}\|_{\infty} \leq \|H_{SE}\|_{\infty}$,
so that replacing $\|H_{SE}\|_{\infty}$ by $\|H_{\mathrm{int}}\|_{\infty}$ always allows for an improvement of the bound (A64). Finally, to obtain a
statement which also holds for non-pure states $\rho_{SE}$, we may add a purifying system $P$ to $SE$ and formally include
it into an extended environment $E' = EP$. The function $\gamma(d)$ is monotonically increasing in $d$. For, let $\lambda_d \in [\frac{1}{d}, 1]$ be
such that $\gamma(d) = 2\sqrt{\lambda_d(1 - \lambda_d)}\,\log\left(\frac{\lambda_d (d-1)}{1 - \lambda_d}\right)$. Then,

$$\gamma(d) = 2\sqrt{\lambda_d(1 - \lambda_d)}\,\log\left(\frac{\lambda_d (d-1)}{1 - \lambda_d}\right) < 2\sqrt{\lambda_d(1 - \lambda_d)}\,\log\left(\frac{\lambda_d \cdot d}{1 - \lambda_d}\right) \leq 2 \max_{\frac{1}{d+1} \leq \lambda \leq 1} \sqrt{\lambda(1 - \lambda)}\,\log\left(\frac{\lambda \cdot d}{1 - \lambda}\right) = \gamma(d+1) . \qquad \text{(A65)}$$

So since $d = \min\{d_S, d_{E'}\} \leq d_S$ we have $\gamma(d) \leq \gamma(d_S)$, which concludes the proof. □



Appendix B: Our criterion 
1. Quantum channels

A quantum channel is mathematically a completely positive and trace-preserving mapping (henceforth CPTPM).
That is, a linear mapping

$$T_{A \to B} : \mathrm{End}(\mathcal{H}_A) \to \mathrm{End}(\mathcal{H}_B) \qquad \text{(B1)}$$

which is such that for $\rho_A \in \mathrm{End}(\mathcal{H}_A)$ we have $\mathrm{tr}\,T_{A \to B}(\rho_A) = \mathrm{tr}\,\rho_A$. Complete positivity requires
that for $\rho_{AR} \in \mathrm{Herm}(\mathcal{H}_A \otimes \mathcal{H}_R)$ with $\rho_{AR} \geq 0$ we have $(T_{A \to B} \otimes \mathcal{I}_R)(\rho_{AR}) \geq 0$ for any finite dimension of $\mathcal{H}_R$. Here,
$\mathcal{I}_R$ denotes the identity on $\mathrm{Herm}(\mathcal{H}_R)$. We will henceforth omit such identities when they appear as tensor factors
and write $T_{A \to B}(\rho_{AR}) = (T_{A \to B} \otimes \mathcal{I}_R)(\rho_{AR})$.

FIG. 2: The $A$-part of a bipartite state $(U_A \otimes 1_R)\,\rho_{AR}$ undergoes an evolution described by the channel $T_{A \to B}$. Will
this evolution lead, for almost all unitaries $U_A$, to a state without correlations between $B$ and $R$?

For a CPTPM $T_{A \to B}$, let $A'$ be a copy of $A$ (so $\mathcal{H}_{A'} \cong \mathcal{H}_A$). We then
define the Choi-Jamiolkowski representation

$$\tau_{A'B} := T_{A \to B}(|\Psi\rangle\langle\Psi|_{AA'}) \in \mathcal{S}_{=}(\mathcal{H}_{A'} \otimes \mathcal{H}_B) . \qquad \text{(B2)}$$

Every CPTPM $T_{A \to B}$ can be written as a concatenation of an isometry $V_{A \to BB'}$ (applied by conjugation) and a partial
trace $\mathrm{tr}_{B'}$ (Stinespring dilation). Here, $B'$ is in general not a copy of $B$.



2. Informal version 



Our whole approach is based on the so-called decoupling technique from quantum information theory [24]. In
quantum information theory, the decoupling theorem was used to exhibit the existence of encoding and decoding
schemes allowing us to send information at a certain rate over a quantum channel [21]. We will employ it here in a
rather different manner.

A very general decoupling theorem which we will use was developed in [22, 23]. Consider a bipartite state $\rho_{AR}$ and
imagine that the correlations between $A$ and $R$ in this state describe the (classical or quantum) information $R$ has
about $A$. The $A$-part then undergoes an evolution separated from the $R$-part, see Figure 2. Specifically, we first apply
a unitary $U_A$ to the $A$-part and input the resulting state to a channel $T_{A \to B}$. The decoupling theorem then provides
conditions for whether the channel destroys the correlations between $A$ and $R$, i.e. decouples the output on $B$ from
$R$, for almost all unitaries $U_A$. If this is the case, the channel output $\tau_B$ on $B$ will be independent of the input and
depend only on the channel itself. This last aspect is exactly what we are interested in here. For simplicity and since
we do not need it, we will henceforth choose $R$ to be trivial and not include it explicitly. It is straightforward to adapt
our arguments to non-trivial $R$.

The question we want to answer therefore comes down to whether a CPTPM $T_{A \to B}$ is such that most input states
yield the same output or not. Our criterion for this to happen is based on the comparison of entropies of the state
$\tau_{A'B}$, the Choi-Jamiolkowski representation of $T_{A \to B}$. Namely, we predict that if

$$H_{\min}^{\varepsilon}(A'B)_{\tau} - H_{\max}^{\varepsilon}(B)_{\tau} \gtrsim 0 \qquad \text{(B3)}$$

for a small $\varepsilon$, almost all input states yield the same output. Conversely, if

$$H_{\max}^{\varepsilon}(A'B)_{\tau} - H_{\min}^{\varepsilon}(B)_{\tau} \lesssim 0 \qquad \text{(B4)}$$

there are input states which yield a distinct output. The differences $H_{\min}^{\varepsilon}(A'B)_{\tau} - H_{\max}^{\varepsilon}(B)_{\tau}$ and $H_{\max}^{\varepsilon}(A'B)_{\tau} - H_{\min}^{\varepsilon}(B)_{\tau}$
coincide (for $\varepsilon \to 0$) for some simple channels, yielding a tight criterion. Three simple examples are given
in Table I.



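Description of mapping          $T_{A \to B}$                                                        $H_{\min}(A'B)_{\tau} - H_{\max}(B)_{\tau}$, $H_{\max}(A'B)_{\tau} - H_{\min}(B)_{\tau}$

Erasure of A                    $\sigma_A \mapsto |0\rangle\langle 0|_A$                             $\log d_A$
Orthogonal measurement on A     $\sigma_A \mapsto \sum_{k=1}^{d_A} |k\rangle\langle k|_A\,\sigma_A\,|k\rangle\langle k|_A$     $0$
Identity on A                   $\sigma_A \mapsto \sigma_A$                                          $-\log d_A$

TABLE I: Entropic quantities specifying the mapping $T_{A \to B}$ in the case $\varepsilon \to 0$. In all of the above examples we have $B = A$.
$\{|k\rangle_A\}_{k=1,\ldots,d_A}$ is an orthonormal basis of $\mathcal{H}_A$.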
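The entries of Table I can be reproduced numerically. The sketch below builds the Choi-Jamiolkowski state from a set of Kraus operators and evaluates the entropy difference with the von Neumann entropy, which is used here as a stand-in for the $\varepsilon \to 0$ smooth entropies (for these three examples all states involved have flat spectra, so min-, max- and von Neumann entropy agree). The Kraus representations, the dimension `d = 4` and the helper names are illustrative assumptions.

```python
import numpy as np

def choi_state(kraus, d_a):
    """tau_{A'B}: apply the channel to the A half of a maximally entangled state."""
    d_b = kraus[0].shape[0]
    psi = np.eye(d_a).reshape(d_a * d_a) / np.sqrt(d_a)   # index order (a', a)
    tau = np.zeros((d_a * d_b, d_a * d_b), dtype=complex)
    for k in kraus:
        v = np.kron(np.eye(d_a), k) @ psi                 # (1_{A'} (x) K)|Psi>
        tau += np.outer(v, v.conj())
    return tau

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log2(ev)).sum())

def criterion(kraus, d_a):
    """H(A'B)_tau - H(B)_tau for the channel given by the Kraus operators."""
    d_b = kraus[0].shape[0]
    tau = choi_state(kraus, d_a)
    tau_b = np.trace(tau.reshape(d_a, d_b, d_a, d_b), axis1=0, axis2=2)
    return entropy(tau) - entropy(tau_b)

d = 4
basis = np.eye(d)
erasure = [np.outer(basis[0], basis[j]) for j in range(d)]      # K_j = |0><j|
measurement = [np.outer(basis[k], basis[k]) for k in range(d)]  # K_k = |k><k|
identity = [np.eye(d)]

table = {name: criterion(ops, d) for name, ops in
         [("erasure", erasure), ("measurement", measurement), ("identity", identity)]}
print(table)
```

For `d = 4` the three values correspond to $\log d_A = 2$, $0$ and $-\log d_A = -2$.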



As a further example for $H_{\min}^{\varepsilon}(A'B)_{\tau} - H_{\max}^{\varepsilon}(B)_{\tau}$ or $H_{\max}^{\varepsilon}(A'B)_{\tau} - H_{\min}^{\varepsilon}(B)_{\tau}$, consider a system $A$ consisting of
$m + n$ qubits and the mapping $T_{A \to B}$ which is just the partial trace over $n$ qubits, leaving the remaining $m$ qubits
which form system $B$ untouched. Then

$$H_{\min}^{\varepsilon}(A'B)_{\tau} - H_{\max}^{\varepsilon}(B)_{\tau} = H_{\max}^{\varepsilon}(A'B)_{\tau} - H_{\min}^{\varepsilon}(B)_{\tau} = n - m \qquad \text{(B5)}$$

for small $\varepsilon$. The more we trace out and the less we leave untouched, the more similar are the channel outputs of
different inputs. We recover the identity and erasure in Table I as special cases.
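The value $n - m$ in (B5) can be confirmed for small qubit numbers. This is a minimal sketch in which the partial trace over the last $n$ qubits is written with the Kraus operators $K_j = 1 \otimes \langle j|$ and the von Neumann entropy again stands in for the $\varepsilon \to 0$ smooth entropies; the function names and the chosen $(m, n)$ pairs are illustrative assumptions.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log2(ev)).sum())

def partial_trace_criterion(m, n):
    """H(A'B)_tau - H(B)_tau for the channel tracing out n of m + n qubits."""
    d_keep, d_drop = 2 ** m, 2 ** n
    d_a = d_keep * d_drop
    # Kraus operators K_j = 1_keep (x) <j| map H_A to H_B.
    kraus = [np.kron(np.eye(d_keep), np.eye(d_drop)[j:j + 1]) for j in range(d_drop)]
    psi = np.eye(d_a).reshape(d_a * d_a) / np.sqrt(d_a)   # maximally entangled on A'A
    tau = np.zeros((d_a * d_keep, d_a * d_keep), dtype=complex)
    for k in kraus:
        v = np.kron(np.eye(d_a), k) @ psi
        tau += np.outer(v, v.conj())
    tau_b = np.trace(tau.reshape(d_a, d_keep, d_a, d_keep), axis1=0, axis2=2)
    return entropy(tau) - entropy(tau_b)

for m, n in [(1, 2), (2, 1), (1, 1), (0, 2)]:
    print(m, n, round(partial_trace_criterion(m, n), 6))
```

The case $m = 0$ is the complete erasure of the input, and $n = 0$ would give the identity channel.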
3. Main tools

We are now able to state our two main theorems for whether a channel is such that different input states yield the
same output or not.

Theorem B.1. Let $\varepsilon > 0$, $\rho_A \in \mathcal{S}_{=}(\mathcal{H}_A)$ and $T_{A \to B}$ be a CPTPM with Choi-Jamiolkowski representation $\tau_{A'B}$.
Then,

$$\Pr_U\left\{ \|T_{A \to B}(U\rho_A U^{\dagger}) - T_{A \to B}(\pi_A)\|_1 \geq 2^{-\frac{1}{2}H_{\min}^{\varepsilon}(A)_{\rho} - \frac{1}{2}H_{\min}^{\varepsilon}(A'|B)_{\tau}} + 12\varepsilon + \delta \right\} \leq 2e^{-d_A\delta^2/16} \qquad \text{(B6)}$$

where the probability is computed over the choice of $U$ from the Haar measure on $\mathbb{U}(A)$, the group of unitaries on $\mathcal{H}_A$.
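Proof. This is a corollary of [22, 23]. [23, Theorem 3.1] with a trivial system $R$ tells us that

$$\int_{\mathbb{U}(A)} \|T(U\rho_A U^{\dagger}) - \tau_B\|_1\,dU \leq 2^{-\frac{1}{2}H_{\min}^{\varepsilon}(A)_{\rho} - \frac{1}{2}H_{\min}^{\varepsilon}(A'|B)_{\tau}} + 12\varepsilon \qquad \text{(B7)}$$

where the integral is over the Haar measure on $\mathbb{U}(A)$. Note that the state $\tau_B$, the partial trace of the Choi-Jamiolkowski
representation of the mapping $T_{A \to B}$, is the state obtained from applying the mapping to the uniform input,

$$\tau_B = \mathrm{tr}_{A'}\,\tau_{A'B} = \mathrm{tr}_{A'}\,T_{A \to B}(|\Psi\rangle\langle\Psi|_{A'A}) = T_{A \to B}(\mathrm{tr}_{A'}\,|\Psi\rangle\langle\Psi|_{A'A}) = T_{A \to B}(\pi_A) . \qquad \text{(B8)}$$

The measure concentration properties of the Haar measure allow us to translate such a statement about a small
average into a statement about an exponentially small probability of a large outcome. For a function $f : \mathcal{A} \to \mathcal{B}$ from
a set $\mathcal{A}$ to a set $\mathcal{B}$ endowed with distance measures $d_{\mathcal{A}}$ and $d_{\mathcal{B}}$ we define the Lipschitz constant

$$L(f) := \sup_{a_1 \neq a_2} \frac{d_{\mathcal{B}}(f(a_1), f(a_2))}{d_{\mathcal{A}}(a_1, a_2)} . \qquad \text{(B9)}$$

From [3ni Corollary 4.4.28] we have the following lemma. For a function $f : \mathbb{U}(\mathbb{C}^d) \to \mathbb{C}$ let $\langle f \rangle_U$ be the Haar measure
average of $f$. Then,

$$\Pr_U\{ |f(U) - \langle f \rangle_U| \geq \delta \} \leq 2e^{-d\delta^2/(4L(f)^2)} \qquad \text{(B10)}$$

where the probability is computed for the choice of $U$ from the Haar measure and where the relevant distance measure
on $\mathbb{U}(\mathbb{C}^d)$ is $\|\ldots\|_2$. This is a generalization of the more well-known Levy's Lemma and can, in contrast to the latter,
also be applied to mixed states. It is shown in the proof of [22, Theorem 3.9] that the Lipschitz constant of the
function

$$f(U) = \|T_{A \to B}(U\rho_A U^{\dagger}) - T_{A \to B}(\pi_A)\|_1 \qquad \text{(B11)}$$

is upper-bounded by

$$2 \max\{ \|T(X)\|_1 : X \in \mathrm{Herm}(\mathcal{H}_A),\ \|X\|_1 \leq 1 \} \cdot \sqrt{\|\rho_A\|_{\infty}} .$$

Since $\rho_A \in \mathcal{S}_{=}(\mathcal{H}_A)$ we have $\sqrt{\|\rho_A\|_{\infty}} \leq 1$. Any $X \in \mathrm{Herm}(\mathcal{H}_A)$ can be written as $X = P_1 - P_2$ with $P_1, P_2 \in
\mathrm{Herm}(\mathcal{H}_A)$, $P_1, P_2 \geq 0$, where $P_1$ and $P_2$ are the positive and negative parts of $X$, so that $\|X\|_1 = \mathrm{tr}\,P_1 + \mathrm{tr}\,P_2$. Since $T$ is
trace-preserving and positive (i.e. maps positive operators to positive operators) we have

$$\|T(X)\|_1 \leq \|T(P_1)\|_1 + \|T(P_2)\|_1 = \mathrm{tr}[T(P_1)] + \mathrm{tr}[T(P_2)] = \mathrm{tr}\,P_1 + \mathrm{tr}\,P_2 = \|X\|_1 , \qquad \text{(B12)}$$

so

$$\max\{ \|T(X)\|_1 : X \in \mathrm{Herm}(\mathcal{H}_A),\ \|X\|_1 \leq 1 \} \leq 1 \qquad \text{(B13)}$$

and the Lipschitz constant of $f$ is upper-bounded by 2. Applying (B10) tells us that

$$\Pr_U\{ |f(U) - \langle f \rangle_U| \geq \delta \} \leq 2e^{-d_A\delta^2/16} . \qquad \text{(B14)}$$

Finally,

$$\Pr_U\left\{ f(U) \geq 2^{-\frac{1}{2}H_{\min}^{\varepsilon}(A)_{\rho} - \frac{1}{2}H_{\min}^{\varepsilon}(A'|B)_{\tau}} + 12\varepsilon + \delta \right\} \leq \Pr_U\{ f(U) \geq \langle f \rangle_U + \delta \} \leq \Pr_U\{ |f(U) - \langle f \rangle_U| \geq \delta \} \qquad \text{(B15)}$$

where the first inequality is due to (B7). □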
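The averaged statement (B7) can be illustrated by sampling. The following sketch takes the partial-trace channel on $m + n$ qubits as a test case, draws Haar-random unitaries via a QR decomposition with phase correction, and compares the sample mean of $\|T(U\rho_A U^{\dagger}) - T(\pi_A)\|_1$ with the bound for $\varepsilon = 0$, which for a pure input and this channel is $2^{-(n-m)/2}$. The channel, the sample size, the seed and all helper names are assumptions made for illustration.

```python
import numpy as np

def haar_unitary(d, rng):
    """Haar-distributed unitary from the QR decomposition of a Ginibre matrix."""
    z = (rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases            # rescale columns so the distribution is Haar

def trace_norm(x):
    """Trace norm of a Hermitian matrix."""
    return float(np.abs(np.linalg.eigvalsh(x)).sum())

def trace_out_last(rho, d_keep, d_drop):
    """Partial trace over the second tensor factor."""
    return np.trace(rho.reshape(d_keep, d_drop, d_keep, d_drop), axis1=1, axis2=3)

rng = np.random.default_rng(1)
m, n = 1, 7                      # keep m qubits, trace out n qubits
d_keep, d_drop = 2 ** m, 2 ** n
d_a = d_keep * d_drop

rho = np.zeros((d_a, d_a), dtype=complex)
rho[0, 0] = 1.0                  # pure input state, so H_min(A)_rho = 0
target = np.eye(d_keep) / d_keep # T(pi_A) for the partial-trace channel

distances = []
for _ in range(200):
    u = haar_unitary(d_a, rng)
    out = trace_out_last(u @ rho @ u.conj().T, d_keep, d_drop)
    distances.append(trace_norm(out - target))

mean_distance = float(np.mean(distances))
average_bound = 2.0 ** (-0.5 * (n - m))   # uses H_min(A'|B)_tau = n - m
print("sample mean:", mean_distance, " bound on the average:", average_bound)
```

For such small dimensions the tail bound (B6) itself is not informative, since $2e^{-d_A\delta^2/16}$ only becomes small for large $d_A$; the sketch therefore only looks at the average.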



Theorem B.2. Let pA G S={'Ha) o,nd Ta^b be a CPTPM with Choi-Jamiolkowski representation ta'b- For any 
e' > and e" , e'" > 0, suppose that 



e'+2e"+£"' + yi 



Then there is no state ujb ^ S={'Hb) such that 



U(A) 



\T{UpAU^)-u:B\\^dU < \ . 



(B16) 



(BIT) 



The integral is with respect to the Haar measure. 

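Proof. The proof consists of two parts. First we show that

$$\int_{\mathbb{U}(A)} \|T(U\rho_A U^{\dagger}) - T(\pi_A)\|_1\,dU > \varepsilon , \qquad \text{(B18)}$$

the proof of which is a formalization of [23, footnote 7]. Then we show that if this is true the integral cannot be small
for any state $\omega_B$.

We apply Lemma F.4 where we think of $R$ as being a classical register which holds the randomly chosen unitary $U$
(the dimension of $R$ is $|\mathbb{U}(A)|$, the cardinality of $\mathbb{U}(A)$, which is infinite). The input state is given by

$$\rho_{AR} := \int_{\mathbb{U}(A)} U\rho_A U^{\dagger} \otimes |U\rangle\langle U|_R\,dU . \qquad \text{(B19)}$$

Since $\bar\rho_A := \int_{\mathbb{U}(A)} U\rho_A U^{\dagger}\,dU = \pi_A$ we have $d_A\sqrt{\bar\rho_A}\,\tau_{A'B}\sqrt{\bar\rho_A} = \tau_{A'B}$. From Lemma F.4 we have that

$$\varepsilon < \|T(\rho_{AR}) - T(\bar\rho_A) \otimes \rho_R\|_1
= \left\| \int_{\mathbb{U}(A)} T(U\rho_A U^{\dagger}) \otimes |U\rangle\langle U|_R\,dU - T(\pi_A) \otimes \int_{\mathbb{U}(A)} |U\rangle\langle U|_R\,dU \right\|_1
= \left\| \int_{\mathbb{U}(A)} \left[ T(U\rho_A U^{\dagger}) - T(\pi_A) \right] \otimes |U\rangle\langle U|_R\,dU \right\|_1
= \int_{\mathbb{U}(A)} \left\| \left[ T(U\rho_A U^{\dagger}) - T(\pi_A) \right] \otimes |U\rangle\langle U|_R \right\|_1 dU
= \int_{\mathbb{U}(A)} \|T(U\rho_A U^{\dagger}) - T(\pi_A)\|_1\,dU . \qquad \text{(B20)}$$

The third equality is due to the fact that all operators in the integral act on mutually orthogonal states due to the
$R$-factor.

Now, assume by contradiction that there is a state $\omega_B \in \mathcal{S}_{=}(\mathcal{H}_B)$ such that

$$\int_{\mathbb{U}(A)} \|T(U\rho_A U^{\dagger}) - \omega_B\|_1\,dU \leq \frac{\varepsilon}{2} . \qquad \text{(B21)}$$

Then, by use of the triangle inequality,

$$\frac{\varepsilon}{2} \geq \int_{\mathbb{U}(A)} \|T(U\rho_A U^{\dagger}) - T(\pi_A)\|_1\,dU - \|T(\pi_A) - \omega_B\|_1 > \varepsilon - \|T(\pi_A) - \omega_B\|_1 . \qquad \text{(B22)}$$

Furthermore, by use of the convexity of the trace distance,

$$\int_{\mathbb{U}(A)} \|T(U\rho_A U^{\dagger}) - \omega_B\|_1\,dU \geq \left\| T\left( \int_{\mathbb{U}(A)} U\rho_A U^{\dagger}\,dU \right) - \omega_B \right\|_1 = \|T(\pi_A) - \omega_B\|_1 . \qquad \text{(B23)}$$

The first equality is due to Lemma F.5. Combining inequalities (B22) and (B23) yields

$$\int_{\mathbb{U}(A)} \|T(U\rho_A U^{\dagger}) - \omega_B\|_1\,dU > \frac{\varepsilon}{2} \qquad \text{(B24)}$$

in contradiction to (B21). □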



4. Discussion

We see from Theorem B.1 that almost all inputs of the form $U\rho_A U^{\dagger}$ for a $\rho_A \in \mathcal{S}_{=}(\mathcal{H}_A)$ and a randomly chosen
$U \in \mathbb{U}(A)$ yield the same output $T_{A \to B}(\pi_A)$ if the term

$$2^{-\frac{1}{2}H_{\min}^{\varepsilon}(A)_{\rho} - \frac{1}{2}H_{\min}^{\varepsilon}(A'|B)_{\tau}} + 12\varepsilon + \delta \qquad \text{(B25)}$$

as well as the probability $2e^{-d_A\delta^2/16}$ in (B6) are small. If the additive terms $12\varepsilon$ and $\delta$ are small, (B25) is exponentially
small in $H_{\min}^{\varepsilon}(A)_{\rho} + H_{\min}^{\varepsilon}(A'|B)_{\tau}$. For large $d_A$ we can make both $\delta$ as well as the probability $2e^{-d_A\delta^2/16}$ small by
choosing $\delta = d_A^{-1/3}$. We therefore express the condition for almost all channel inputs with the same eigenvalues as $\rho_A$
yielding the same output slightly informally as

$$H_{\min}^{\varepsilon}(A)_{\rho} + H_{\min}^{\varepsilon}(A'|B)_{\tau} \gtrsim 0 . \qquad \text{(B26)}$$

Applying Lemma A.7 and neglecting different smoothing parameters and additive correction terms of order $O(\log\frac{1}{\varepsilon})$ we
may also express this condition as

$$H_{\min}^{\varepsilon}(A)_{\rho} + H_{\min}^{\varepsilon}(A'B)_{\tau} - H_{\max}^{\varepsilon}(B)_{\tau} \gtrsim 0 . \qquad \text{(B27)}$$

Similarly, according to Theorem B.2 different input states with the same eigenvalues as $\rho_A$ do not yield the same
output if

$$H_{\min}^{\varepsilon}(A)_{\rho} + H_{\max}^{\varepsilon}(A'B)_{\tau} - H_{\min}^{\varepsilon}(B)_{\tau} \lesssim 0 . \qquad \text{(B28)}$$

Our criterion for whether different input states yield the same output or not is therefore tight up to correction terms of
order $O(\log\frac{1}{\varepsilon})$ and differences between smooth min- and max-entropies. We emphasize, however, that this difference
may become arbitrarily large. Since we are usually interested in the limit of small, positive epsilons, it may at first
sight seem disturbing that the correction terms of order $O(\log\frac{1}{\varepsilon})$ diverge in this limit. We keep in mind, however,
that the divergence is only logarithmic and that the epsilons do not depend on the size of the systems. The entropic
terms, on the other hand, grow proportionally with the size of the system. In the thermodynamic limit the logarithmic
divergence is therefore negligible. Since $H_{\min}^{\varepsilon}(A)_{\rho} \geq 0$ we may also formulate the following condition which does not
depend on the input state any longer but only on the channel $T_{A \to B}$. If

$$H_{\min}^{\varepsilon}(A'B)_{\tau} \gtrsim H_{\max}^{\varepsilon}(B)_{\tau} \qquad \text{(B29)}$$

almost all input states (pure or mixed) yield the same output. Conversely, if

$$H_{\max}^{\varepsilon}(A'B)_{\tau} \lesssim H_{\min}^{\varepsilon}(B)_{\tau} \qquad \text{(B30)}$$

there are states in $\mathcal{S}_{=}(\mathcal{H}_A)$ (namely those with a small $H_{\min}^{\varepsilon}(A)_{\rho}$, i.e. states which are close to pure) which yield a
distinct output.
5. In the i.i.d. scenario

We saw in the previous section that in the condition for whether different input states yield the same output or
not there is a gap which is due to differences between smooth min- and max-entropies. In an i.i.d. scenario the gap
between smooth min- and max-entropies can be closed according to Theorem A.4, which allows us to replace the smooth
entropies by the von Neumann entropy. A little more work is required to rigorously prove the following theorem. It
states that if we apply an i.i.d. channel to a pure state of a tensor product space $\mathcal{H}_A^{\otimes n}$, then for large enough $n$
the sign of $H(A'|B)_{\tau}$ with $\tau_{A'B}$ the Choi-Jamiolkowski representation of $T_{A \to B}$ provides a tight criterion for whether
different pure input states yield the same output. For large enough $n$, if $H(A'|B)_{\tau} > 0$ almost all states $|\phi\rangle_{A^n} \in \mathcal{H}_A^{\otimes n}$
yield the same output. This is not the case if $H(A'|B)_{\tau} < 0$.

Theorem B.3. LcITa^b be a CPTPM with Choi-Jamiolkowski representation ta'b- Then 

Pr |||r^" (|</')(^U") - > exp ' ^ ' (H(A'|i?). " ^)) + ^ + ^a"^'} ^ ^cxp (-d^J'/w) (B31) 

where the probability is computed over the choice of from the Haar measure on 'H'^ . The constant c is 



ceO\J'^-lo,dA 



(B32) 



Conversely, Ve,£' > 3iV such that VneN, n> N ifll{A'\B)r < -e then $ujb^ & such that 



(||r«"(|0)(0U.)-a.B"||i) 



(B33) 



where the average is computed over the Haar measure on "H^" . 



Proof. Theorem B.l gives for the channel Ta^_^b^ i\ia.i 
Pr 



'1- {\\rt%B^m{<P\A< 



B II 1 — 



HL„(A'"|B").®" + 12e + j| < 2 exp (-dJ(5Vl6) • (B34) 



For the exponent we find with Lemma F.6 and (A5) 
n f 1 



;Hfni„(^'"|S")^«„ 



log • lo: 



< - i -H(A'|i?), + ^ • 4 Jlog f ^ ) • log ( 2^dA + 1 



(B35) 



Defining e = and S — d^^'^ yields the first assertion 



The condition of the converse, Theorem B.2 requires for pure input states \(^){(^\a^ 

2 



He -\-1e" ^e'" + s/e I iti\ 
min \^ i|0>(0L 



loe 



Hf„in(-S")r»" < . 



(B36) 
The l.h.s. of this condition is, by use of Lemma F.7, Lemma F.8, Lemma F.6 and (A5),

2 



^min J 10) (01. 



< ri < — log 



< n < — log 



+ll{A'B)r 



1 


1 - 


(e' + 2e" + e'" 4 
1 




1 - 


(e' + 2£" + 4 




r + 




•log( 




1 





log "^2 + HLax(^'"-S")r«" - H^in(-S")T« 



n 



lAUB 



1 



Hmin('^")r®" 



< n < — log 

1 " 1 - (e' + 2e" + e'" + ^/e) 



2 ' -l°g;72 



H(A'B), + -L . 4^105 • log (^VdAdB + 2 

1^ . J log ■ log (2-5H™(s). + 2*H...x(B). ^ ^ 



1 , 1 

< n < — log 

[n 1 - (e' + 2£" + £"' + Vi) 



^1 2 
2 ' -^0S-;2 



+R{A'B)r + ^ • 4Wlog ^ • log (^d^ + 2 



-H(B), 



• 4^ loe 



log y/ds + 1 



^^{R{A'\B)r+6{n)} 



(B37) 



Hence if H(A'|i3)T- < S{n) condition (B36) is fulfilled. Let e' — e" = e'" = ^- £ may take any value for which the 
term log — ra- is still well-defined. So for large enough n, e may take any value of the form 1 — 2^' with 

l-(e'+2e"+e"'-l-y/ej 

6' > 0. The quantity S{n) becomes arbitrarily small for large enough n. So after relabling S{n) i— > e and 5' i— e' the 
assertion follows from Theorem IB. 21 □ 



Appendix C: Independence of the initial state of the environment 



1. Essentially tight version 

We now apply the tools developed in the previous sections to the physical scenarios we are interested in. Consider
a state which at $t = 0$ is of product form $|\phi\rangle_S \otimes |\psi\rangle_E$. [13] We assume that the initial state of the environment is
subject to some kind of constraint. Physically, we may assume for instance that we know that the temperature of
the initial state of $E$ lies in some narrow interval. Mathematically, we describe this by restricting the initial state to
a subspace $\mathcal{H}_{\Omega_E} \subseteq \mathcal{H}_E$. We let the initial product state evolve and ask ourselves after a time $t$ whether the state of
the system $\rho_S(t)$ depends on the precise initial state of the environment $|\psi\rangle_E$ (or only on the space $\mathcal{H}_{\Omega_E}$). We provide
sufficient and necessary conditions for this to be the case. This criterion is based on the comparison of local entropies
of some particular state $\tau_{SE}(t)$. Understanding how the local entropies of one particular state change therefore allows
us to make predictions which hold for almost all initial states from $\mathcal{H}_{\Omega_E}$.

Theorem C.1. Consider an initial product state

$$|\phi\rangle_S \otimes |\psi\rangle_E \qquad \text{(C1)}$$

with $|\psi\rangle_E \in \mathcal{H}_{\Omega_E} \subseteq \mathcal{H}_E$. Let $\rho_S^{\psi}(t)$ denote the state of the system at time $t$ when the environment was initially in the
state $|\psi\rangle\langle\psi|_E$, so

$$\rho_S^{\psi}(t) = \mathrm{tr}_E\left[ e^{-iH_{SE}t}\,(|\phi\rangle\langle\phi|_S \otimes |\psi\rangle\langle\psi|_E)\,e^{+iH_{SE}t} \right] , \qquad \text{(C2)}$$

and let

$$\tau_{SE}(t) = e^{-iH_{SE}t} \left( |\phi\rangle\langle\phi|_S \otimes \frac{1_{\Omega_E}}{d_{\Omega_E}} \right) e^{+iH_{SE}t} . \qquad \text{(C3)}$$

Then for all times $t$ and for any $\varepsilon > 0$ we have that



Pr 

We 



Pt {t)-rs{t) 



< e 



,/16 



(C3) 



(C4) 



The probability is computed over the choice of an initial environment state \i^)E from the Haar measure on 



Conversely, if for any e' > and e" , e'" > we have 



1 



HLin('5')r -H^ax(-E^)r -log- — n 

!-(£' + 2e" + £"' + Ve) 



log > 



then there is no state ujs G S={'Hs) such that 



< 



e 



where the average 



We 2 ' 

W>-B computed over the Haar measure on Hn^. 



(C5) 



(C6) 



Note that the logarithmic correction terms in the above theorem do not (unlike the entropic quantities) scale with
the size of $S$ or $E$ and therefore are negligible in a thermodynamic limit. If at any time $t$ the entropy $H_{\min}^{\varepsilon}(E)_{\tau}$ in
(C4) is sufficiently larger than $H_{\max}^{\varepsilon}(S)_{\tau}$, all but exponentially few initial states $|\psi\rangle_E \in \mathcal{H}_{\Omega_E}$ yield the same evolved
state $\rho_S(t)$. On the other hand, if $H_{\min}^{\varepsilon}(S)_{\tau}$ in (C5) is sufficiently larger than $H_{\max}^{\varepsilon}(E)_{\tau}$, different initial environment
states $|\psi\rangle_E$ do not yield the same time-evolved state of $S$. We have a criterion which is basically tight up to differences
between smooth min- and max-entropies. The strength of this criterion is based on the fact that evaluating it for the
single state $\tau_{SE}(t)$ allows us to make predictions which hold for almost all initial states from $\mathcal{H}_{\Omega_E}$.



2. Simple version for large environments 



The entropy in $SE$ of the state $\tau_{SE}(t)$ is (modulo smoothing) for all times given by $\log d_{\Omega_E}$. It seems clear that if
$\log d_{\Omega_E}$ is sufficiently larger than $\log d_S$ (which we assume to be the case in a physical scenario), the entropy in $S$ can
never become larger than the entropy in $E$. Hence by Theorem C.1 the state of the system will for all times be the
same for almost all initial states from $\mathcal{H}_{\Omega_E}$. This intuition is made precise in the following theorem.



Theorem C.2. Consider the setting of Theorem C.1. Then for all times $t$ we have that

$$\Pr_{|\psi\rangle_E}\left\{ \|\rho_S^{\psi}(t) - \tau_S(t)\|_1 \geq \frac{d_S}{\sqrt{d_{\Omega_E}}} + d_{\Omega_E}^{-1/3} \right\} \leq 2e^{-d_{\Omega_E}^{1/3}/16} \qquad \text{(C7)}$$

where the probability is computed over the choice of an initial environment state $|\psi\rangle_E$ from the Haar measure on $\mathcal{H}_{\Omega_E}$.

So all initial states $|\psi\rangle_E \in \mathcal{H}_{\Omega_E}$, except a fraction which is exponentially small, yield at any time evolved states of
$S$ which lie inside a ball with radius $\frac{d_S}{\sqrt{d_{\Omega_E}}} + d_{\Omega_E}^{-1/3}$ in trace distance. In [1, Section IV.A] a result almost identical
to Theorem C.2 is shown. Namely, the states $\rho_S(t)$ and $\tau_S(t)$ are replaced by their temporal averages. Our result is
more general since it makes a statement about the state of $S$ at any time.

The dimensions of the Hilbert spaces grow exponentially in the number of constituent particles. The probability
$2e^{-d_{\Omega_E}^{1/3}/16}$ is therefore vanishingly small in a physical situation, where even a restricted environment $\mathcal{H}_{\Omega_E}$ has a large
number of degrees of freedom. Note, however, that the exponentially small fraction of environment states which may
lead to a distinct evolved state of the system is not just a technicality. As an example, consider $\mathcal{H}_{\Omega_E}$ to describe the
space of initial environment states at room temperature and $|\phi\rangle_S$ to be a hot cup of coffee. Most initial states from
$\mathcal{H}_{\Omega_E}$ will just cool down the coffee, so $\tau_S(t)$ is the cooling cup of coffee. However, if there is a ticking bomb in the
environment, this will lead to a distinct evolution of $S$.
Let us now prove these two theorems.






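Proof. We apply Theorem B.1 to the channel describing the dependence $|\psi\rangle\langle\psi|_E \mapsto \rho_S(t)$, which is given by

$$\rho_S(t) = \mathcal{T}_{\Omega_E\to S}\left(|\psi\rangle\langle\psi|_E\right) := \mathrm{tr}_E\left[U(t)\left(|\phi\rangle\langle\phi|_S\otimes|\psi\rangle\langle\psi|_E\right)U(t)^\dagger\right] \qquad \text{(C8)}$$

where $U(t) = e^{-iH_{SE}t}$. Theorem B.1 then predicts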



Pr 



(C9) 



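where $\tau_{S\Omega_E'}$ is the Choi-Jamiolkowski representation of $\mathcal{T}_{\Omega_E\to S}$ and where the probability is computed over the choice of $U_{\Omega_E}$ from the Haar measure on $\mathbb{U}(\Omega_E)$. We have $H^{\varepsilon}_{\min}(\Omega_E)_{|\psi\rangle\langle\psi|} \geq 0$. We define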



which purifies the Choi-Jamiolkowski representation $\tau_{S\Omega_E'}$ of the channel $\mathcal{T}_{\Omega_E\to S}$.
Proof of Theorem C.1. By use of Lemma A.7 and Lemma A.3 we find

Yll,,,MS)r > HL(^^£^)r - HLx(5)r - 2 • log 



(CIO) 
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HL„(^)r - H|,ax(5), - 2 • log 
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(Cll) 



We find 



Pr {\\rnE^s{UnM{MEUlj - TsW, > 2^^^^^'^^'^<^^'^^^^^ + Us + s} < 2e-'^'^''^'' . (C12) 

We chose 6 — ^^n]/^- Picking a unitary from the Haar measure and inputing the state C^n^ IV') (i/'IfiC^n^ into 
the channel Tn^^S '^^ equivalent with picking the input state \i}})e from the Haar measure on T-Ln^, and hence the 
assertion. 

The converse assertion follows directly from applying Theorem B.2 to the channel (C8). We make use of the fact that by Lemma F.7 we have



1 



1 - (e' + 2e" + £"' + Vi) 



and by Lemma A.3 we have

Proof of Theorem C.2. Note that for all times t we have



(C13) 

(C14) 
(C15) 



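By use of Lemma A.6, Lemma A.3 and the strong subadditivity (A22) we find

$$\begin{aligned}
H^{\varepsilon}_{\min}(\Omega_E'|S)_\tau &\geq H^{\varepsilon}_{\min}(\Omega_E' S)_\tau - \log d_S \\
&= H^{\varepsilon}_{\min}(E)_\tau - \log d_S \\
&\geq H^{\varepsilon}_{\min}(E|S)_\tau - \log d_S \\
&\geq H^{\varepsilon}_{\min}(SE)_\tau - 2\log d_S \\
&\geq \log d_{\Omega_E} - 2\log d_S .
\end{aligned}$$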



We find 



Pr <^ ||ro,^5(f/oJV^)(V'bC/L)-rs||i > 



ds 



(C16) 



(C17) 



We chose $\varepsilon \to 0$ and $\delta = d_{\Omega_E}^{-1/3}$. Picking a unitary $U_{\Omega_E}$ from the Haar measure and inputting the state $U_{\Omega_E}|\psi\rangle\langle\psi|_E U_{\Omega_E}^\dagger$ into the channel $\mathcal{T}_{\Omega_E\to S}$ is equivalent to picking the input state $|\psi\rangle_E$ from the Haar measure on $\mathcal{H}_{\Omega_E}$, and hence the assertion. □

Appendix D: Independence of the initial state of the system
1. Main result

We consider again the initial state $|\phi\rangle_S\otimes|\psi\rangle_E$, but are now interested in whether the evolved state of the system, $\rho_S(t)$, still depends on its own initial state $|\phi\rangle\langle\phi|_S$. This will obviously be the case for short enough times. In much the same way as we did in the last section for the initial state of E, we now assume that the initial state of S satisfies certain "macroscopic constraints", formally $|\phi\rangle_S \in \mathcal{H}_{\Omega_S} \subseteq \mathcal{H}_S$. We are interested in whether at any given time t the state $\rho_S(t)$ is the same for almost all initial states from $\mathcal{H}_{\Omega_S}$. Again, we will find a state $\tau_{SE}(t)$ such that comparing its local entropies allows us to answer this question. Formally, we have the following theorem.



Theorem D.1. Consider a pure initial product state $|\phi\rangle_S\otimes|\psi\rangle_E$ jointly evolving under a Hamiltonian $H_{SE}$. Assume that $|\phi\rangle_S$ is an element of a subspace $\mathcal{H}_{\Omega_S}\subseteq\mathcal{H}_S$. Let $\rho_S(t)$ denote the evolved state of S given that $\rho_S(0) = |\phi\rangle\langle\phi|_S$. Let $\tau_{SE}(t)$ denote the evolved state of the initial state $\tau_{SE}(0) = \pi_{\Omega_S}\otimes|\psi\rangle\langle\psi|_E$. Then, for every $\varepsilon > 0$, $\delta > 0$,



Pr 

I0>s 



pm-rs{t) 



1 H 2 

> 25"° 



(Dl) 



where the probability is computed over the choice of $|\phi\rangle_S$ from the Haar measure on $\mathcal{H}_{\Omega_S}$. Conversely, as long as for any $\varepsilon' > 0$ and $\varepsilon'', \varepsilon''' \geq 0$ we have



HLin('5')r(t) - H^ax(£^)r(t) " log 



1 - (e' + 2e" + e'" - ^e) 



log ■ 



> 



there is no state $\omega_S \in \mathcal{S}_=(\mathcal{H}_S)$ such that



< 



1/ 



(D2) 



(D3) 



where the average $\langle\ldots\rangle_{|\phi\rangle_S}$ is computed over the Haar measure on $\mathcal{H}_{\Omega_S}$.

At t = 0 the entropy in S of the state $\tau_{SE}(t)$ is (modulo smoothing) given by $\log d_{\Omega_S}$ while the entropy in E is zero. From (D2) we can see that as long as the entropy in S is larger than the entropy in E, it is not (yet) the case that different initial states of S have evolved to the same state. If, on the other hand, at any time t the entropy in E has become larger than the entropy in S, almost all initial states from $\mathcal{H}_{\Omega_S}$ will have evolved close to $\tau_S(t)$.
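The comparison of local entropies can be explored numerically. The following sketch tracks the entropies of S and E for the state evolved from $\pi_S\otimes|\psi\rangle\langle\psi|_E$. It uses the plain von Neumann entropy as an unsmoothed stand-in for the smooth min- and max-entropies of the theorem, so it only conveys the qualitative picture; the function names and the Hamiltonian passed in are assumptions of this example.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits (unsmoothed stand-in for the smooth entropies)."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

def local_entropies(H, psi_E, d_S, d_E, t):
    """Entropies (H(S), H(E)) of tau_SE(t) evolved from pi_S (x) |psi><psi|_E."""
    w, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T
    tau0 = np.kron(np.eye(d_S) / d_S, np.outer(psi_E, psi_E.conj()))
    tau = (U @ tau0 @ U.conj().T).reshape(d_S, d_E, d_S, d_E)
    tau_S = np.trace(tau, axis1=1, axis2=3)  # trace out E
    tau_E = np.trace(tau, axis1=0, axis2=2)  # trace out S
    return entropy_bits(tau_S), entropy_bits(tau_E)

def random_hamiltonian(dim, seed=0):
    """Hypothetical test Hamiltonian: a random Hermitian matrix."""
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (G + G.conj().T) / 2
```

At t = 0 the function returns $\log_2 d_S$ for S and 0 for E, matching the statement above. Evaluating it on a grid of times shows when the entropy of E overtakes that of S for a given Hamiltonian.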

Note that we cannot obtain a strong achievability statement if $d_{\Omega_S}$ is very small, because then there is no choice of $\delta$ such that both the additive term $\delta$ in (D1) and the probability $2e^{-d_{\Omega_S}\delta^2/16}$ are small. This is due to the fact that the measure concentration properties of the Haar measure only give strong results in high-dimensional spaces. If the restricted environment is dimension-wise very small, we can still conclude from (B7) that



P%{^~rs{t) 



< 25' 



„(-E)x(t)+log ] 



Vie 



(D4) 



This statement about a small averaged distance can then be translated by a Markov-type argument into a statement about a small probability for a large distance, but we do not obtain an exponentially strong statement like (D1).
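The trade-off between the additive term $\delta$ and the failure probability can be made concrete with a few lines of code. This sketch assumes the failure probability has the form $2\exp(-d\,\delta^2/16)$, with d the dimension of the restricted subspace; the function name `concentration_tradeoff` is hypothetical.

```python
import math

def concentration_tradeoff(d, delta):
    """Return (additive term, failure probability) for subspace dimension d.

    Assumes a failure probability of the form 2 * exp(-d * delta**2 / 16).
    """
    return delta, 2.0 * math.exp(-d * delta ** 2 / 16.0)

# Small subspace: with delta = d**(-1/3) the probability bound is vacuous.
small = concentration_tradeoff(10, 10 ** (-1 / 3))
# Large subspace: both the additive term and the probability are tiny.
large = concentration_tradeoff(10 ** 9, (10 ** 9) ** (-1 / 3))
```

For d = 10 the probability bound exceeds 1 and says nothing, whereas for d = 10^9 the choice $\delta = d^{-1/3}$ makes both quantities small at once.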



We usually think of the environment as consisting of many more particles than the system and hence as being much larger dimension-wise. If, on the other hand, the environment E is dimension-wise sufficiently smaller than the restricted system $\mathcal{H}_{\Omega_S}$, it will not have enough degrees of freedom to "absorb" all the information about the initial state of the system; hence the system will for all times retain some memory about almost any possible initial state. This intuition can be made rigorous by estimating the entropic terms in (D2). Namely, by use of the strong subadditivity (A22), Lemma A.6 and the definitions of the smooth entropy measures we have that



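$$\begin{aligned}
H^{\varepsilon''}_{\min}(S)_{\tau(t)} - H^{\varepsilon'''}_{\max}(E)_{\tau(t)} &\geq H^{\varepsilon''}_{\min}(S)_{\tau(t)} - \log d_E \\
&\geq H^{\varepsilon''}_{\min}(S|E)_{\tau(t)} - \log d_E \\
&\geq H^{\varepsilon''}_{\min}(SE)_{\tau(t)} - 2\log d_E \\
&\geq \log d_{\Omega_S} - 2\log d_E .
\end{aligned}$$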



(D5)

Proof. Let $U(t) := e^{-iH_{SE}t}$. We are interested in the channel

$$\mathcal{T}_{\Omega_S\to S}: \rho_S(0) \mapsto \rho_S(t) = \mathrm{tr}_E\left[U(t)\left(\rho_S(0)\otimes|\psi\rangle\langle\psi|_E\right)U(t)^\dagger\right] .$$

From Theorem B.1 we have that

Vv{\\r{UpsU^)-Ts\l > 2-5HL.„(s)p-^H^.„(o^|s). ^12e + 4 < 2e-^-sSVi^ 

Us '~ J 

where $\tau_{S\Omega_S'}$ is the Choi-Jamiolkowski representation of the channel $\mathcal{T}$ which is purified by the state



(D6) 
(D7) 

(D8) 



and $\tau_S = \mathcal{T}_{\Omega_S\to S}(\pi_{\Omega_S})$. By use of Lemmas A.3 and A.7 we have



HLi„(f^sl^). > H^.i„(^^^5)r - mMS)r - 2 • log 
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KuniE), " Haax(5). - 2 • log 
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which proves the first part of the theorem. The second part of the theorem follows directly from Theorem 
applying that again by Lemma |A.3| 

HLax(^S'S')r = H^ax(^)r • 

The relevant state for the entropic quantities is therefore 

$$\tau_{SE}(t) = U(t)\left(\pi_{\Omega_S}\otimes|\psi\rangle\langle\psi|_E\right)U(t)^\dagger .$$



(D9) 
|R2]and 

(DIO) 

(Dll) 

□ 



2. Time-scales



The criteria developed so far do not make any "physical" assumptions whatsoever. We do, for instance, not make 
any kind of locality assumptions on Hse nor do we consider a weak-coupling limit between S and E. If we want to 
bound thermalization times with our criteria, we have to know how fast the different entropy measures can be changed. 
This of course crucially depends on these kinds of physical assumptions. Here, we want to derive a bound on the 
time which is needed for different initial states of S to evolve to the same state which even holds in the most general 
scenario. Let us stress, however, that we expect much better bounds to hold if we do make "physical" assumptions 
like locality of the interactions. 

We saw in the previous section that comparing the local entropies of the state

$$\tau_{SE}(t) = e^{-iH_{SE}t}\left(\pi_{\Omega_S}\otimes|\psi\rangle\langle\psi|_E\right)e^{+iH_{SE}t} \qquad \text{(D12)}$$

determines whether generic initial states of $\mathcal{H}_{\Omega_S}$ have on average already evolved close to some specific state or not. This is not the case as long as (D2) is fulfilled. In Section A 6 we discussed how fast different entropy measures can change. We apply these results in order to investigate how long we can guarantee that (D2) is fulfilled, and thus how long we can guarantee that different initial states have on average not yet evolved to the inside of a ball with a certain radius in trace distance. The times we obtain this way only depend on the radius $\varepsilon$ of the ball in trace distance, the Hamiltonian $H_{SE}$ and the initial state of the environment $|\psi\rangle_E$.

With Corollaries A.9 and A.10 we have the tools at hand to bound the entropic terms in (D2). The relevant initial state we are interested in when applying these corollaries is $\tau_{SE}(0) = \pi_{\Omega_S}\otimes|\psi\rangle\langle\psi|_E$. We find that (D2) is at least fulfilled as long as



-log 



1 

das 



log 



1 



1-t/T 



log- 



1 



log ■ 



> 



(D13) 



1 



Since the smoothing term $\varepsilon$ is related to the radius in trace distance we are interested in, we are left with one free parameter $\varepsilon'$ which may be chosen so as to optimize the condition (D13). Solving (D13) for the time at which equality is obtained for the first time, we find that for times below $c(\varepsilon, d_{\Omega_S})\cdot T$ there is no state $\omega_S \in \mathcal{S}_=(\mathcal{H}_S)$ such that

$$\left\langle\left\|\rho_S(t)-\omega_S\right\|_1\right\rangle_{|\phi\rangle_S} \leq \varepsilon .$$



Here, T = T (Use, t^Qs I^)(V'I-e) as introduced in ( A39) and c(e, dfjs) solely depends on e and d^^- 
of e and values for dn^ which are not too small, c(e, d^s) is of order 10~^. 



(D14) 



For small values

a. Achievability

We will now show by use of a toy example that it is indeed possible that almost all initial states of $\mathcal{H}_{\Omega_S}$ evolve to the same state in times of order T. We choose $\mathcal{H}_{\Omega_S} = \mathcal{H}_S$, so $\tau_{SE}(t) = e^{-iH_{SE}t}\left(\pi_S\otimes|\psi\rangle\langle\psi|_E\right)e^{+iH_{SE}t}$. In which time is it possible to achieve



it) 



(D15) 



which by Theorem D.1 allows us to predict that almost all pure initial states on $\mathcal{H}_S$ have evolved to $\tau_S(t)$? To this end, we analyze a particular combination of a Hamiltonian $H_{SE}$ and initial state of the environment $|\psi\rangle\langle\psi|_E$.

As we have seen in Theorem D.1, the initial state $\tau_{SE}(0) = \pi_S\otimes|1\rangle\langle 1|_E$ largely determines the behavior of generic initial states of the form $|\phi\rangle\langle\phi|_S\otimes|1\rangle\langle 1|_E$. For this initial state we consider an environment which is a copy of the system and an evolution governed by the Hamiltonian



SE 



ds 

E 

1=2 



(D16) 



with 



|l>.-^|l>s< 



We 



(D17) 



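A straightforward calculation shows that the effect of this Hamiltonian is to interchange the contents of S and E in time $\pi$, with the exception of the first coefficient. Formally,

$$\tau_S(t) = \frac{1-\cos(t)}{2}\,|1\rangle\langle 1|_S + \frac{1+\cos(t)}{2}\,\pi_S \qquad \text{(D18)}$$

and

$$\tau_E(t) = \frac{1+\cos(t)}{2}\,|1\rangle\langle 1|_E + \frac{1-\cos(t)}{2}\,\pi_E . \qquad \text{(D19)}$$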
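The swap behaviour, $\tau_S(t) = \frac{1-\cos t}{2}|1\rangle\langle 1|_S + \frac{1+\cos t}{2}\pi_S$, can be checked numerically. Since the explicit form of the toy Hamiltonian is not legible here, the sketch below assumes one candidate that reproduces this behaviour: $H_{SE} = \sum_{l\geq 2}|\Psi_l\rangle\langle\Psi_l|$ with $|\Psi_l\rangle = (|1\rangle_S|l\rangle_E - |l\rangle_S|1\rangle_E)/\sqrt{2}$. This choice, like the function names, is an assumption of the example and may differ from the Hamiltonian intended in (D16).

```python
import numpy as np

def toy_hamiltonian(d):
    """Assumed toy Hamiltonian: sum over l >= 2 of projectors onto
    (|1>_S |l>_E - |l>_S |1>_E) / sqrt(2). Index 0 plays the role of |1>."""
    H = np.zeros((d * d, d * d))
    for l in range(1, d):
        v = np.zeros(d * d)
        v[0 * d + l] = 1 / np.sqrt(2)   # |1>_S |l>_E
        v[l * d + 0] = -1 / np.sqrt(2)  # |l>_S |1>_E
        H += np.outer(v, v)
    return H

def tau_S_numeric(d, t):
    """Reduced state on S at time t, starting from pi_S (x) |1><1|_E."""
    w, V = np.linalg.eigh(toy_hamiltonian(d))
    U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T
    e1 = np.zeros(d)
    e1[0] = 1.0
    tau0 = np.kron(np.eye(d) / d, np.outer(e1, e1))
    tau = (U @ tau0 @ U.conj().T).reshape(d, d, d, d)
    return np.trace(tau, axis1=1, axis2=3)

def tau_S_formula(d, t):
    """Closed form (1 - cos t)/2 |1><1| + (1 + cos t)/2 pi_S."""
    e1 = np.zeros(d)
    e1[0] = 1.0
    return (1 - np.cos(t)) / 2 * np.outer(e1, e1) + (1 + np.cos(t)) / 2 * np.eye(d) / d
```

At $t = \pi$ the numerical state equals $|1\rangle\langle 1|_S$, that is, the maximally mixed content of S has been moved entirely into E.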



The condition for initial state independence is that the maximal trace distance 
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iH2,,(S)^(t)-iH2,„(iJ),(t)+Iog 



12e + 6 



(D20) 



as well as the probability $2e^{-d_S\delta^2/16}$ in (D1) be small. We evaluate (D20) at $t = \pi$. According to (D18) and (D19) we have $\tau_S(\pi) = |1\rangle\langle 1|_S$ and $\tau_E(\pi) = \pi_E$. Hence $H^{\varepsilon}_{\min}(E)_{\tau(\pi)} \geq \log d_E = \log d_S$ and $H^{\varepsilon}_{\max}(S)_{\tau(\pi)} \leq 0$. We find therefore for (D20)



22Hn 



-5Hl„(£;),(^)+logf| 



12e 



5< 
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12e + S . 



(D21) 



For £ 



dg^^^ and (5 



dg^^'^ this becomes 36dg^^^- 



-1/3 



which becomes arbitrarily small for large enough $d_S$. With this choice of $\delta$, also the probability $2e^{-d_S\delta^2/16}$ becomes exponentially small in $d_S$. Hence for large enough $d_S$ it is possible with the given Hamiltonian that all but exponentially few pure initial states of S evolve to the same state in time $\pi$. In order to compare this with our lower bound $c(\varepsilon, d_{\Omega_S})\cdot T$, we need to know what a time $\pi$ is in multiples of T; that is, we have to calculate T for the above combination of initial state and Hamiltonian. Our result becomes stronger the larger T is. Correspondingly, we have to find the decomposition



$$H_{SE} = H_S\otimes\mathbb{1}_E + \mathbb{1}_S\otimes H_E + H_{\mathrm{int}} \qquad \text{(D22)}$$



of Hse which makes \\Hi, 



as small as possible. A trivial decomposition with Hs 



ll^^mtlloo ~ ^- '^^'^ slightly improve this by chosing Hs = \^s and He = which yields \\H. 
calculation shows that 



and He = yields 
= i. A brief 



int 



[HsE,TSEm = 



2d. 



ds 

E 

1=2 



zl)(lz|-|l*)(*l| 



(D23)

The operator $\sum_{l=2}^{d_S}\left(|l1\rangle\langle 1l| - |1l\rangle\langle l1|\right)$ has $(d_S-1)^2+1$ eigenstates $|ij\rangle$ with $i \neq 1 \neq j$ or $i = 1 = j$ which yield a zero eigenvalue. Their orthogonal complement has dimension $2(d_S-1)$. The states $\frac{1}{\sqrt{2}}\left(|1l\rangle \pm i|l1\rangle\right)$ with $2 \leq l \leq d_S$ form an orthonormal basis for this orthogonal complement and yield eigenvalues $\mp i$. So



E 

i=2 



and 



In conclusion we have 



I [HsE,TSE 



= 2{ds - 1) 



ds-l 



ds 



T= (min{4||i?„ 



SE,TSE 



ds 



ds-l 



(D24) 



(D25) 



(D26) 



ds-l 



for this combination of Hamiltonian and initial state. The time tt expressed in multiples of T is therefore tt - — ^ 
so it is indeed possible that almost all initial states of S evolve to the same state in times of order T. 



3. I.i.d. interactions 



We saw in Section B 5 that there is a tight criterion for whether different pure input states of an i.i.d. channel $\mathcal{T}^{\otimes n}_{A\to B}$ with large n yield the same output or not. This criterion is given by the sign of the conditional von Neumann entropy $H(A'|B)_\tau$ of the Choi-Jamiolkowski representation $\tau_{A'B}$ of the channel. Such an i.i.d. channel $\mathcal{T}^{\otimes n}_{A\to B}$ has a direct physical interpretation. For instance, consider the input space $\mathcal{H}_S^{\otimes n}$ to be a quantum data storage and consider n to be sufficiently large. Every factor is then subject to the same interaction with its local environment, i.e. the channel is

$$\mathcal{T}_{S\to S}(\sigma_S) := \mathrm{tr}_E\left[e^{-iH_{SE}t}\left(\sigma_S\otimes\rho_E\right)e^{+iH_{SE}t}\right] \qquad \text{(D27)}$$

By the n-fold tensor product we model the i.i.d. characteristics of noise. At t = 0 we have $H(S'|S)_{\tau(0)} = -\log d_S$.



According to Theorem B.3, as long as $H(S'|S)_{\tau(t)} < 0$, different initial states of the storage $\mathcal{H}_S^{\otimes n}$ have not yet evolved to the inside of a ball with a certain radius in trace distance. The storage still "remembers" its initial state. The storage has been erased by the noise if at any time $H(S'|S)_{\tau(t)} > 0$. Almost all (pure) initial states of the product space $\mathcal{H}_S^{\otimes n}$ will then have evolved to the same state $\tau_S(t)^{\otimes n}$. We note that once this has happened, all entanglement between the different S-factors has been destroyed.
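The sign change of the conditional entropy of the Choi-Jamiolkowski state can be seen in a minimal example. The sketch below assumes a single qubit S coupled to a single qubit E by the swap operator as Hamiltonian, with E initially in $|0\rangle\langle 0|$; this model and the function names are illustrative assumptions, not the interaction considered in the text.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log2(w)).sum())

def choi_conditional_entropy(t):
    """H(S'|S) in bits of the Choi state of sigma -> tr_E[U (sigma (x) |0><0|) U^dagger],
    with U = exp(-i t SWAP) acting on qubits S and E (assumed toy interaction)."""
    d = 2
    swap = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            swap[j * d + i, i * d + j] = 1.0
    # SWAP squares to the identity, so exp(-i t SWAP) = cos(t) 1 - i sin(t) SWAP.
    U = np.cos(t) * np.eye(d * d) - 1j * np.sin(t) * swap
    # Maximally entangled state on S'S, tensored with |0>_E; index order (S', S, E).
    Psi = np.eye(d).reshape(d * d) / np.sqrt(d)
    e0 = np.zeros(d)
    e0[0] = 1.0
    v = np.kron(np.eye(d), U) @ np.kron(Psi, e0)
    T = v.reshape(d, d, d)
    tau_SpS = np.einsum('abe,cde->abcd', T, T.conj()).reshape(d * d, d * d)
    tau_S = np.einsum('abe,ace->bc', T, T.conj())
    return entropy_bits(tau_SpS) - entropy_bits(tau_S)
```

In this toy model the conditional entropy starts at -1 bit at t = 0 (the storage is intact) and reaches +1 bit at $t = \pi/2$, where the content of S has been swapped out completely.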

Our criterion holds for any state E may initially be in and for any interaction between S and E. The only assumption 
we make is that the interaction between S and E may be described as i.i.d. on the length-scale of S. If there are still 
correlations in the system's interaction with the environment on the length-scale of one copy of S we may group a 
number of factors S together which is large enough such that the correlations in the interaction decay over the new 
enhanced length-scale. This obviously reduces n, the number of factors, correspondingly. 

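If $\log d_S > 4\log d_E$ the noise can never fully erase the storage. In this case the noise does not have enough degrees of freedom to "absorb" all information stored initially in the storage. To see this, let $|\phi\rangle\langle\phi|_{EP}$ be a purification of $\rho_E$, so

$$\tau_{S'SEP} := e^{-iH_{SE}t}\left(|\Psi\rangle\langle\Psi|_{S'S}\otimes|\phi\rangle\langle\phi|_{EP}\right)e^{+iH_{SE}t} \qquad \text{(D28)}$$

is a purification of $\tau_{S'S}$. For all times t we have $H(SEP)_\tau = \log d_S$. Then,

$$\begin{aligned}
H(S'|S)_\tau &= H(EP)_\tau - H(S)_\tau \\
&\leq H(EP)_\tau - H(S|EP)_\tau \\
&= 2H(EP)_\tau - H(SEP)_\tau \\
&\leq 4\log d_E - \log d_S \\
&< 0 . \qquad \text{(D29)}
\end{aligned}$$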

If we again do not impose any physical restrictions on $H_{SE}$, we know from Corollary A.11 that the von Neumann entropy can be changed with a maximal rate which is proportional to $\|H_{\mathrm{int}}\|_\infty$. We can apply this corollary to investigate how long we can guarantee that $H(S'|S)_{\tau(t)} < 0$. This leads to a trivial lower bound on the erasure time which is given by the inverse of the interaction strength $\|H_{\mathrm{int}}\|_\infty$ times some factor of order 1 which depends on $d_S$. With the same toy model as in the previous subsection we can see that up to a small constant factor this is also achievable.

Appendix E: Absence of thermalization
1. Previous results

In this section, we are interested in conditions under which a system will (for most initial states of the environment) not become independent of some particular initial state. It has been shown that the time-averaged state $\langle\rho_S(t)\rangle_t$ of the system is independent of its initial state $|\phi\rangle\langle\phi|_S$ if the energy eigenstates of $H_{SE}$ are sufficiently entangled. In [21] a converse statement is proven. Assume that there is a basis of $\mathcal{H}_S$ such that for all energy eigenstates $|E_k\rangle \in \mathcal{H}_S\otimes\mathcal{H}_E$ the partial trace $\mathrm{tr}_E|E_k\rangle\langle E_k|$ is close to one of the basis states $|i\rangle\langle i|_S$. This implies that all energy eigenstates are close to product states and thus are poorly entangled. Consider two pure initial product states whose S-part is one of the basis elements. Then the distance of their time-averages on S is unlikely to be much smaller than the initial distance of their S-parts. Formally, we have the following theorem.

Theorem E.l (|2T] Theorems 1 and 2). Consider two pure initial product states l^^*-')^^; — \<t>^^^) s'S^l^'^^'^) e, i G {1,2} 
evolving under a nan- degenerate Hamiltonian Hse, yielding temporal averages 1^5^. We define the quantity 



(El) 



which is small if all energy eigenstates either are close to orthogonal to $|\psi^{(i)}\rangle\langle\psi^{(i)}|_{SE}$ or locally resemble $|\phi^{(i)}\rangle\langle\phi^{(i)}|_S$ (which implies that they are poorly entangled). Then,



(1) 



(2) 



> 



- |0(2))(0(2)|s - i?(|<^W)(0(^'|) - i?(|0(^)))(0(2))| 



(E2) 



The quantity R{\(f>'^^'^){(j)'-^^) is small for almost all i/|<^W)s 

is an element of a basis {\l}s}iZi of Us which 
is such that the partial trace of each energy eigenstate is close to one of the basis elements. Formally, 



where 



6 := maxminjltrs \Ek){Ek\ 

k I 



<Sds 



(E3) 



(E4) 



This disproves the long-held conjecture that all non-integrable systems thermalize (since they do not necessarily 
become independent of their initial state). We refer to for different definitions of integrability. A possible definition 
is to require the existence of ds mutually commuting and linearly independent conserved operators on 5. 

By use of our results about independence of the initial state of the environment in Section C we are able to extend this result. Instead of making a statement about the temporal average of S we make a statement about the time-evolved state itself. We not only prove that the distance between pure initial states which locally resemble an energy eigenstate does not decrease, but that such states actually stay close to their initial states for all times. In contrast to statement (E3), which is a statement about an averaged distance, we obtain an exponentially strong statement. We do not require that all energy eigenstates be close to product states (as is necessary for $\delta$ in (E4) to be small) but only the ones which are most relevant for the particular initial state of the system. Furthermore, we allow for mixed initial states of the environment.

The most important improvement, however, is the following: Inequality (E3| gives only a non-trivial statement if 
S < This upper bound decreases exponentially with the number of constituent particles of the system. For a large 



system S we therefore need all energy eigenstates to be very close to product and thus basically need that there be 
no interaction at all. On the other hand, the bound we obtain is non-trivial independently of the size of S. 



2. Preliminaries



Theorem C.2 not only tells us that the time-evolved state of the system $\rho_S(t)$ will very likely be close to a state $\tau_S(t)$ which is independent of the initial state of the environment, but also what this state looks like. If $\rho_S(0) = |\phi\rangle\langle\phi|_S$ and the initial state of the environment is drawn from all of $\mathcal{H}_E$, it is given by

$$\tau_S(t) = \mathrm{tr}_E\left[e^{-iH_{SE}t}\left(|\phi\rangle\langle\phi|_S\otimes\pi_E\right)e^{+iH_{SE}t}\right] . \qquad \text{(E5)}$$

If $\tau_S(t) \approx |\phi\rangle\langle\phi|_S$ for all times, the system will at any time be close to its initial state, for almost all initial states of the environment. We will show that this will indeed be the case if those energy eigenstates $|E_k\rangle_{SE}$ which, in a certain sense, locally are "most similar" to $|\phi\rangle\langle\phi|_S$, are close to product states.
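The state $\tau_S(t) = \mathrm{tr}_E[e^{-iH_{SE}t}(|\phi\rangle\langle\phi|_S\otimes\pi_E)e^{+iH_{SE}t}]$ is easy to evaluate numerically for small systems. The following sketch is a hypothetical weak-coupling model, chosen only for illustration: a two-level system with a large level splitting, a random environment Hamiltonian and a random interaction scaled by a coupling constant `g`. None of these choices come from the text.

```python
import numpy as np

def tau_S_distance(g, t, d_E=6, seed=1):
    """Trace distance between tau_S(t) and |phi><phi|_S for a hypothetical model
    H = H_S (x) 1 + 1 (x) H_E + g * H_int, with |phi> an eigenstate of H_S."""
    rng = np.random.default_rng(seed)
    d_S = 2

    def rand_herm(n):
        G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        A = (G + G.conj().T) / 2
        return A / np.linalg.norm(A, 2)  # normalize to operator norm 1

    H_S = np.diag([0.0, 10.0])  # large splitting compared to H_E and g * H_int
    H_E = rand_herm(d_E)
    H_int = rand_herm(d_S * d_E)
    H = np.kron(H_S, np.eye(d_E)) + np.kron(np.eye(d_S), H_E) + g * H_int
    w, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T
    phi = np.array([1.0, 0.0])
    P = np.outer(phi, phi)
    tau0 = np.kron(P, np.eye(d_E) / d_E)
    tau = (U @ tau0 @ U.conj().T).reshape(d_S, d_E, d_S, d_E)
    tau_S = np.trace(tau, axis1=1, axis2=3)
    return float(np.abs(np.linalg.eigvalsh(tau_S - P)).sum())
```

Without coupling (`g = 0`) the distance vanishes, since all energy eigenstates are then exact product states. For weak coupling the eigenstates stay close to product states and the distance remains small.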

In order to make this precise, we need a somewhat more involved notation than in the main part of this article. Consider a basis $\{|i\rangle_S\}_{i=1,\ldots,d_S}$ of $\mathcal{H}_S$ and a basis $\{|j\rangle_E\}_{j=1,\ldots,d_E}$ of $\mathcal{H}_E$. Both $\{|i\rangle_S\otimes|j\rangle_E\}_{i=1,\ldots,d_S,\,j=1,\ldots,d_E}$ and $\{|E_k\rangle_{SE}\}_{k=1,\ldots,d_Sd_E}$ are bases of the Hilbert space $\mathcal{H}_S\otimes\mathcal{H}_E$. We consider mappings between these two bases, i.e. mappings of the form

$$\{1,\ldots,d_Sd_E\} \to \{1,\ldots,d_S\}\times\{1,\ldots,d_E\}, \qquad k \mapsto \left(\xi(k),\tilde{\xi}(k)\right) \qquad \text{(E6)}$$

and define

$$f_k := F\left(|E_k\rangle, |\xi(k)\rangle_S|\tilde{\xi}(k)\rangle_E\right) . \qquad \text{(E7)}$$

We are interested in how well product states of the form $|\phi\rangle_S|j\rangle_E$ (with a fixed $\phi \in \{1,\ldots,d_S\}$ and arbitrary j) can be approximated by such a mapping $k \mapsto (\xi(k),\tilde{\xi}(k))$. Note that if such a product state has high overlap with an energy eigenstate, this necessarily implies that the eigenstate is weakly entangled. We restrict to mappings which are injective and pick the one which maximizes $f_k$ for states of the form $|\phi\rangle_S\otimes|j\rangle_E$. Formally, we are interested in the quantity

$$\delta(\phi) := \max\,\min\left\{f_k : \xi(k) = \phi \text{ and } k \mapsto \left(\xi(k),\tilde{\xi}(k)\right) \text{ is injective}\right\} . \qquad \text{(E8)}$$

Lemma E.2. With the notation introduced above, consider an initial state $|\phi\rangle_S$ with $\phi \in \{1,\ldots,d_S\}$. Assume that $\delta(\phi) > \frac{1}{\sqrt{2}}$. Then with $\tau_S(t)$ as defined in (E5) we have

$$\left\|\tau_S(t) - |\phi\rangle\langle\phi|_S\right\|_1 \leq 4\delta(\phi)\sqrt{1-\delta(\phi)^2} \qquad \text{(E9)}$$

for all times t.

We can always obtain an upper bound on $\|\tau_S(t) - |\phi\rangle\langle\phi|_S\|_1$ which is close to 0 if $\delta(\phi)$ is close to 1. By definition (E8) the requirement $\delta(\phi) > \frac{1}{\sqrt{2}}$ requires that $f_k > \frac{1}{\sqrt{2}}$ if $\xi(k) = \phi$. If this condition is fulfilled, the r.h.s. of (E9) is smaller than 2 and thus non-trivial.
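The bound $4\delta\sqrt{1-\delta^2}$ arises from combining a fidelity bound $F \geq 2\delta^2 - 1$ with the inequality $\|\cdot\|_1 \leq 2\sqrt{1-F^2}$. The short sketch below evaluates both expressions so their agreement and the role of the threshold $1/\sqrt{2}$ can be checked; the function names are chosen for this example only.

```python
import math

def fidelity_lower_bound(delta):
    """Lower bound 2 * delta**2 - 1 on the fidelity between tau_S(t) and the initial state."""
    return 2 * delta ** 2 - 1

def trace_distance_bound(delta):
    """Upper bound 4 * delta * sqrt(1 - delta**2) on the trace distance."""
    return 4 * delta * math.sqrt(1 - delta ** 2)

def bound_via_fidelity(delta):
    """The same bound written as 2 * sqrt(1 - F**2) with F = 2 * delta**2 - 1."""
    F = fidelity_lower_bound(delta)
    return 2 * math.sqrt(1 - F ** 2)
```

At $\delta = 1/\sqrt{2}$ the bound equals the trivial value 2, and it decreases to 0 as $\delta$ approaches 1.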



3. Theorem and proof 



Combining Lemma E.2 with Theorem C.2 directly yields the statement that for all times there is an exponentially 
small probability that the initial state of the environment was such that the system is further away from its initial 
state than a certain distance. 



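Theorem E.3. With the notation introduced above, consider an initial state $|\phi\rangle_S$ with $\phi \in \{1,\ldots,d_S\}$. Assume that $\delta(\phi) > \frac{1}{\sqrt{2}}$. Let $\rho_S(t)$ denote the evolved state of the system. Then for all times t

$$\Pr_{\rho_E(0)}\left\{\left\|\rho_S(t) - |\phi\rangle\langle\phi|_S\right\|_1 \geq 4\delta(\phi)\sqrt{1-\delta(\phi)^2} + d_S\,d_E^{-1/2} + d_E^{-1/3}\right\} \leq e^{-d_E^{1/3}/16} \qquad \text{(E10)}$$

where the probability is computed over the choice of an initial state of the environment $\rho_E(0) \in \mathcal{S}_=(\mathcal{H}_E)$ with arbitrary eigenvalues and a Haar distributed eigenbasis (e.g. a pure state from the Haar measure).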

Since the dimension $d_E$ grows exponentially with the number of particles in E, we think of the second and third summand in (E10) as negligible compared to the first summand. The theorem therefore tells us that the system will at any time t with very high probability be within a ball of radius $4\delta(\phi)\sqrt{1-\delta(\phi)^2}$ (in trace distance $\|\ldots\|_1$) around its initial state $|\phi\rangle\langle\phi|_S$. This radius is small if $\delta(\phi)$ is close to 1, that is, if the energy eigenstates which on S are most similar to $|\phi\rangle\langle\phi|_S$ are sufficiently close to product states.

Consider the case where $\{|i\rangle_S\}_{i=1,\ldots,d_S}$ is the eigenbasis of $H_S$ and $\{|j\rangle_E\}_{j=1,\ldots,d_E}$ the eigenbasis of $H_E$. We view $H_{\mathrm{int}}$ as a small perturbation of the unperturbed Hamiltonian $H_S\otimes\mathbb{1}_E + \mathbb{1}_S\otimes H_E$. Then, the unperturbed value of $f_k$ is 1 and the first order correction is 0.

A weakness of both Theorem E.1 and Theorem E.3 is that an average or a probability is computed over the choice of an initial state of the environment from the Haar measure on all of $\mathcal{H}_E$. This allows for the possibility that there is a reasonably large subspace $\mathcal{H}_{\Omega_E} \subset \mathcal{H}_E$ such that initial states taken from it do not generically lead to memory effects in the system. One might imagine, for example, that all initial states of E with a sufficiently large temperature do not lead to such memory effects, but that still most states from the Haar measure on all of $\mathcal{H}_E$ do.

We will now prove Lemma E.2. The above theorem then follows straightforwardly from Theorem C.2 and a single application of the triangle inequality.



Proof. In order to shorten our notation we introduce the shorthands $\phi_S = |\phi\rangle\langle\phi|_S$, $\Psi_{EE'} = |\Psi\rangle\langle\Psi|_{EE'}$, $\xi(k)_S = |\xi(k)\rangle\langle\xi(k)|_S$ and $\tilde{\xi}(k)_E = |\tilde{\xi}(k)\rangle\langle\tilde{\xi}(k)|_E$. Sums with summation index k or l go from 1 to $d_Sd_E$ and sums with summation index r go from 1 to $d_E$. By use of the assumed injectivity (and hence also bijectivity) of the mapping, we have



(Ell) 



This implies that J2k ^ '*C(^)s ® C(^)-E is a unitary, since 



kl 



A: 
ISE 



(E12) 



We first show that Ts{t) has high fidelity with the state 



trs 



E ® iik)E {cj^s ®7:e)[Y. e+^^'*C(05 ® mE 

\ k ) \ I / 



and then show that this state is identical with $\phi_S$. According to (A8) the fidelity can only increase under partial traces, that is, it can only decrease if we calculate it for purifications of the actual states, so

fc J \ I 

^ e-^^^*f(A:)s ® i{k)E^ (0s ® tte) e+^^'*C(05 ® I(Ob 

F'i^(^e-^'''''\Ek){Ek\^ iq^s^^EE') (^e+^'''*\Ei){Ei\^ , 

(j2 e-'^'^'ms ® mE) i^s ® ^EE') Ij2 e+''''*e(Os ® mE] \ . (eis) 



F^ I ITE 



Iye 
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Both these states are pure, so using ( A7| and that \'i}EE' = X^r find 



> 



Msi^EE' (^e+'^^''\Ei){Ei\sE^ (^e-^'^'-'ms^mE^ \^)s\^)ee^ 
}^T.(^\s{r\E (^e+^''''\E,){E,\sE^ (^e-^'^'-'ms ^ i{k)E^ ms\r)E 
}^T.(^\s{r\E (^e+^^''\Ei){E,\sE^ (^e-^^'-^ak)) sim e^ W)^., 



(E14) 



By definition (E8) the requirement 5(0) > ^ requires that 



F- 



SE, \£,ik)) s\iik)) E^ > 2 



(E15) 



if = 1. Since F^ \^\Ei)se, \Uk)) s\i{k)) e} = 1, this also imphes that [44] 



Ei^'{|i?/)si^,|?(fc))5||(fc)>s}<^ 



(E16) 



if = 1- We conclude that 



Y,s^^^(k)F^{\Ei)sE,\m)s\m)E} 
Y,^^,mF^[\Ei)sE,\m)s\m)E} . 



k^l 
< 



(E17) 



k=l 



We split up the sum $\sum_{kl} = \sum_{k=l} + \sum_{k\neq l}$ and use that for $a, b \in \mathbb{C}$ with $|a| \geq |b|$ we have $|a+b| \geq |a| - |b|$ to obtain from (E14)



1 



Using that 



^ [j- Y.^<,>,m^^'^''''''''^'F^{\Ei)sEAm)s\m)E} 

\ ^ k=l 

Y,s^.m^-^^''''~'''^'F^{\Ei)sEAm)s\m)E] 
^ ( ^ E 5^,mF^{\Ei)sE, \m)s\m)E} 

\ ^ k=l 

E W)^' {\Ei)sE, \m)sm))E} 
J2 {\Ei)sE, m))sm)E} = 1 - F'{\E,)sE, m))sm))E} ^i^ii 

l.l^k 



this simpHfies to 



E e-^^'=*e(fc)s ® i{k)E ® E e''''''*c(05 ® mE 

k J \ I J 

\ h k / 

= (iEw-) (2/1-1)) 

Applying the definition of 6{(j)) and the bijectivity of the mapping we finally obtain 



> 



(^(25W'-i)EW)) 



= (2<5(0f-l)' 
As for the second part of the proof, 



trs 



K k / \ I / 

E^^'"^^''^"'''^'^«(^)'^*,?w%.),aol^(^))(^(^)l^ 

kl 

^^^''^^'''^'^^^<i>,mk{i^)m)^i(k),i{i)'i>s 



kl 
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Applying (Ell ) for the first equality and the bijectivity for the second this simplifies to 



\ k J \ I / 



k 



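We find

$$F\left(\tau_S(t),\phi_S\right) \geq 2\delta(\phi)^2 - 1 \qquad \text{(E23)}$$

and by use of (A10)

$$\begin{aligned}
\left\|\tau_S(t)-\phi_S\right\|_1 &\leq 2\sqrt{1-F\left(\tau_S(t),\phi_S\right)^2} \qquad &\text{(E24)} \\
&\leq 4\delta(\phi)\sqrt{1-\delta(\phi)^2} \qquad &\text{(E25)}
\end{aligned}$$

which is lower than 2 if $\delta(\phi)$ is larger than $\frac{1}{\sqrt{2}}$.

□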



Appendix F: Technical lemmas
Lemma F.1. Let $\rho_A \in \mathcal{S}_\leq(\mathcal{H}_A)$ and $0 \leq \Pi_A \leq \mathbb{1}_A$. Then,



(Fl) 



Proof. This follows directly from [371 Lemma B.24.]. We make the identification T-Lb — span^, {|0)b} and realize that 
in this case 

PAB^ Pa<»\0){0\b (F2) 

for every pab G S<{Hab) and that 

= |0)(0|s (F3) 
for all crB,i-^B G It follows that with the identifications made the condition 

Ia(E)ujb -'nAB{iA(^<yB)'nAB >0 (F4) 

is automatically fulfilled. The assertion then follows by direct application of the lemma cited. □ 

Lemma F.2. Lemma B.25.] Let £ > and pa G S<{'Ha)- Then there exists < 11^ < such that pA G 
B'^ (Uapa^a) and 



H^ax(A)p > HR{A)npn + 2 • log ^ 



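(F5)

Lemma F.3. If

$$\frac{d}{dt}f(t) \leq \alpha K\,f(t)^{1-\frac{1}{\alpha}} \qquad \text{(F6)}$$

for an $\alpha > 0$ then

$$f(t) \leq \left[f(0)^{\frac{1}{\alpha}} + Kt\right]^{\alpha} \qquad \text{(F7)}$$

for $t \geq 0$.

Proof. We introduce the auxiliary function

$$h(t) := f(t)^{\frac{1}{\alpha}} - Kt \qquad \text{(F8)}$$

which by use of the assumption is monotonically decreasing since

$$\frac{d}{dt}h(t) = \frac{1}{\alpha}\,f(t)^{\frac{1}{\alpha}-1}\,\frac{d}{dt}f(t) - K \leq 0 . \qquad \text{(F9)}$$

Since $h(0) = f(0)^{\frac{1}{\alpha}}$ we conclude that $h(t) \leq f(0)^{\frac{1}{\alpha}}$ for all $t \geq 0$, which is equivalent to the assertion. □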
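The growth bound $f(t) \leq [f(0)^{1/\alpha} + Kt]^\alpha$ for functions with $\frac{d}{dt}f \leq \alpha K f^{1-1/\alpha}$ can be illustrated by integrating such a differential inequality numerically. The sketch below uses a simple Euler scheme and a hypothetical modulation `rate_fraction(t)` with values in [0, 1] that keeps the derivative at or below the allowed maximum; all names and parameter values are assumptions of this example.

```python
import math

def integrate(f0, alpha, K, rate_fraction, t_end, steps=20000):
    """Euler-integrate f' = rate_fraction(t) * alpha * K * f**(1 - 1/alpha).

    rate_fraction(t) must lie in [0, 1], so that f' never exceeds alpha * K * f**(1 - 1/alpha).
    """
    dt = t_end / steps
    f, t = f0, 0.0
    for _ in range(steps):
        f += dt * rate_fraction(t) * alpha * K * f ** (1 - 1 / alpha)
        t += dt
    return f

def lemma_bound(f0, alpha, K, t):
    """Upper bound [f(0)**(1/alpha) + K * t]**alpha."""
    return (f0 ** (1 / alpha) + K * t) ** alpha
```

With `rate_fraction` identically 1 the integrated function approaches the bound, which shows that the bound is attained by the extremal solution; any smaller modulation stays strictly below it.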

Lemma F.4. 123, Theorem 4--1] Let par. G S={%ab) Ta^b a CPTPM with Choi-Jamiolkowski representation 
TA'B and TA'B — dA J PA' ta' b J PA' ■ For any e' > and e" , e'" > 0, suppose that 



\A\R)p + \lC.AA'B), - w2XB)f < - log ^ . (FIO) 



^,'+2e"+e"' + yi,^|^^^ + wCj^Bh - HLJi?)^ < - log ] 

e' 
Then, 

\\T{pAR)-T{pA)®PRh>e . (Fll) 
Remark: Compared to the original version, this version has a slightly improved smoothing parameter ($\sqrt{\varepsilon}$ instead of $\sqrt{2\varepsilon}$ on the l.h.s. of (F10)). This is due to applying Lemma A.2 instead of the original Fuchs-van de Graaf inequality (A10) in the derivation.

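Lemma F.5. Let $\rho_{AR} \in \mathcal{S}_=(\mathcal{H}_A\otimes\mathcal{H}_R)$. Then

$$\int_{\mathbb{U}(A)} U\rho_{AR}U^\dagger\,dU = \pi_A\otimes\rho_R . \qquad \text{(F12)}$$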

Proof. Since we are working with finite dimensional Hilbert spaces, we have par S Herm('HA ® 'Hr) = Herm(HA) <8) 
Herm(Hi?). Let a basis of the real vector space Herm('HA) and a basis of Herm(^ij). We assume 

that both bases are normalized, i.e. tr^i t^*'' — ti r ti''^^ — 1. Then par. is uniquely decomposable in the form 



Par = 



Y.c.,rf®^ (F13) 



with ^jj- Cij — 1. Every term of the form Jjj^^-j Ur^^WdU commutes with every V G U(A) (by use of the invariance 
of the Haar measure) and hence by Schur's Lemma and the invariance of the trace under conjugation with a unitary 

Ut^^^UUU ^TTA ■ (F14) 

V{A) 



We find 



/ UparUUu = ^c,j I UTfuUu®j]%^ 

Jv(A) -■ Jv(A) 




(F15) 

where the last equality follows from the definition of the partial trace. □ 
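The identity $\int_{\mathbb{U}(A)} U\rho_{AR}U^\dagger\,dU = \pi_A\otimes\rho_R$ can be checked exactly for a qubit A without sampling the Haar measure. The sketch below uses the fact that the average of $U\rho U^\dagger$ over the Haar measure coincides with the uniform average over the four Pauli operators, which form a unitary 1-design. Replacing the integral by this finite average is a shortcut chosen for this example and is not the argument of the proof above.

```python
import numpy as np

PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def twirl_A(rho_AR, d_R):
    """Average of (P (x) 1) rho (P (x) 1)^dagger over the Pauli group on qubit A."""
    out = np.zeros_like(rho_AR)
    for P in PAULIS:
        W = np.kron(P, np.eye(d_R))
        out += W @ rho_AR @ W.conj().T
    return out / len(PAULIS)

def random_density_matrix(dim, rng):
    """Random full-rank density matrix G G^dagger / tr(G G^dagger)."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho)

def partial_trace_A(rho_AR, d_A, d_R):
    """Reduced state on R."""
    return np.trace(rho_AR.reshape(d_A, d_R, d_A, d_R), axis1=0, axis2=2)
```

For any state on a qubit A and a system R, `twirl_A` returns the maximally mixed state on A tensored with the reduced state on R, including for states that are correlated between A and R.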
Lemma F.6. ^IS, Theorem 9] For pab G S={TLab) and n E N we have for > I log ^ 



^Hf„i„(A"|B")p»„ > H(^|B)p - ■ 4 Jlog (^-1 ) • log (2-5H»in(^|s)p + 2iHn„x(A|s), ^ (p^g) 
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Remark: The quantity T{A\B)p\p appearing in the original version of [131 Theorem 9] has been replaced by use of 
the inequality between Definition 4 and Lemma 8 on p. 5 of |13| . 

Lemma F.7. Let G S^{Ha) be a pure state. Then, 

HLi„(^)|0>(0| = log = 1^+0 {e') ■ (F17) 



Proof. Consider the state fj.\(f)){(f)\A G S<{'Ha) with < < 1. We have P(|0) (01^1, — ^/l — fi and 
Hmin(^)^|<^)<0| =-logM. Choosing 

P(|0) (0U, {<P\a) =e^fi = l-e^ (F18) 



yields together with the definition of the smooth min-cntropy ( A19) that H^;jj(v4)|0)^0| > log y372 - To see the converse 
assertion, let aA e C 5<('Ha) be such that Hf„i,j(yl)|0)(0| = Hmi„(A)^. Then, 

e>Pm{(t>\A,<JA) 
^ H^.in(^)|0)(0| < log ■ (F19) 



□ 



Lemma F.8. For pA G S={T-La) we have for n> ^ log ^2 



^ H^_(^"),«„ < H(A), + ^ • 4 J log ^ . log (Tdl + 2) . (F20) 



Proof. Let |(/))((/)|^p be a purification of pA. Then by use of (A21 ) and Lemma F.6 we have 

-Hfnax(^")p®" = Hf„i„(A"|P")^»„ 

n n 

< -H(A|P)|^)(^| + ■ 4^1og (^^^ • log (2-5H.„n(A|p)|,H*i + 23H»ax(A|p)|,>(*i + 1^ . (F21) 
The first summand is 

-H(A|P)|^)(^I = -H(AP)|^)(^| + H(P)|^)(^| - H(A)|^)(^| . (F22) 



By use of ( A28 ) we have 

H,„in(A|P)^>-logd^ (F23) 
since we can always find a purifying system P with dp — cLa. Using that 

lA®ap\4>)AP<WA®lp\4>)AP = l (F24) 



for every crp G S={'Hp) we find from (A4) that 

H„,ax(A|P)0 < (F25) 
which yields the assertion. □ 
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